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ABSTRACT 

We present a numerical code for simulating the evolution of astrophysical systems using particles 
to represent the underlying fluid flow. The code is written in Fortran 95 and is designed to be 
versatile, flexible and extensible, with modular options that can be selected either at the time the 
code is compiled or at run time through a text input file. We include a number of general purpose 
modules describing a variety of physical processes commonly required in the astrophysical community 
and we expect that the effort required to integrate additional or alternate modules into the code will 
small. In its simplest form the code can evolve the dynamical trajectories of a set of particles in two 
or three dimensions using a module which implements either a Leapfrog or Runge-Kutta-Fehlberg 
integrator, selected by the user at compile time. The user may choose to allow the integrator to 
evolve the system using individual timesteps for each particle or with a single, global time step for 
all. Particles may interact gravitationally as A^-body particles, and all or any subset may also interact 
hydrodynamically, using the Smoothed Particle Hydrodynamic (SPH) method by selecting the SPH 
module. A third particle species can be included with a module to model massive point particles 
which may accrete nearby SPH or iV-body particles. Such particles may be used to model, e.g., stars 
in a molecular cloud. Free boundary conditions are implemented by default, and a module may be 
selected to include periodic boundary conditions. We use a binary 'Press' tree to organize particles 
for rapid access in gravity and SPH calculations. Modules implementing an interface with special 
purpose 'GRAPE' hardware may also be selected to accelerate the gravity calculations. If available, 
forces obtained from the GRAPE coprocessors may be transparently substituted for those obtained 
from the tree, or both tree and GRAPE may be used as a combination GRAPE/tree code. The code 
may be run without modification on single processors or in parallel using OpenMP compiler directives 
on large scale, shared memory parallel machines. We present simulations of several test problems, 
including a merg er simula t ion o f two elliptical galaxies with 800000 particles. In comparison to the 
Gadget-2 code of lSpringel (|2005( l. the gravitational force calculation, which is the most costly part of 
any simulation including self-gravity, is ~ 4.6 — 4.9 times faster with VINE when tested on different 
snapshots of the elliptical galaxy merger simulation when run on an Itanium 2 processor in an SGI 
Altix. A full simulation of the same setup with 8 processors is a factor of 2.91 faster with VINE. The 
code is available to the public under the terms of the Gnu General Public License. 
Subject headings: methods: numerical — methods: A-body simulations — galaxies: interactions 



1. INTRODUCTION 

In modern astrophysics, the numerical simulation of 
systems whose complexity is beyond the capabilities of 
analytical models has become a widely used tool. On 
nearly all length scales, ranging from problems on cos- 
mological distances, to galaxy formation and evolution to 
star and planet formation, numerical simulations have 
contributed much to our current understanding of the 
physical processes which lead to the universe we observe. 

The numerical simulation of a self-gravitating and/or 
gas dynamical system is a basis common to all those 
problems, no matter what length scale they belong to. 
The simulation techniques for such systems can be di- 
vided into two different approaches: grid based methods 
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divide space into finite sized cells and compute the phys- 
ical quantities such a s temperature, pressure, etc., inside 
those cells (see e.g. IStone k, NormanI Il992t iRvu et aTl 
ll993t lO'Shea et al.l l2004 and references therein). Parti- 
cle based methods represent a system by a set of par- 
ticles to which physical quantities such as mass, po- 
sition and velocity are assi gned or computed (see e.g. 
Hernquist fc Katzjn 989: Da ve et al.lll997l : iSpringel etall 
2OOI1: Wadslev et al.li2004.Springe]||2005l and references 



therein). Which approach is best for modeling a partic- 
ular system depends both on the problem to be mod- 
eled and the biases of the researcher doing the model- 
ing. Without going into the details of relative merits 
and shortcomings of either approach, we point out that 
for some problems, a grid based approach may be nearly 
unfeasible because of the existence of irregular bound- 
aries. Large voids can also be problematic for a grid 
based simulation because they require a large number of 
empty or nearly empty and uninteresting zones be in- 
cluded, at a high computational expense. CoUisionless 
systems are typically only modelled with particle repre- 
sentations. Although these can be modelled using a grid 
method (i.e. 'Particle In Cell', or PIC, methods) to cal- 



2 



Wetzstein et al. 



culate a long range force such as gravity, these systems 
are often modeUed using tree-based or direct summation 
forces. A particle based simulation naturally concen- 
trates the computational work in the most interesting 
areas of the flow, in most cases a very valuable feature, 
but the absence of particles in voids can be problematic 
if the simulation requires such low d ensity regions to be 
resolved at high accuracy (see e.g. lO'Shea et all [2005L 
for a recent comparison of grid and particle based codes 
for cosmological simulations). It may also suffer from 
a relatively poorer reproduction of the fluid behavior at 
shocks. 

For a system evolving only under the influence of grav- 
ity, a particle based approach leads to the classical A^- 
body problem. A set of Np particles evolve according 
to the force on each particle exerted by all the others. 
The efficient computation of these forces is a longstand- 
ing numerical problem. If gas dynamics is al so required , 
the Smoothed Particle Hydrodynamic (SPH, [Luc\ill977t 
iGingold fc Monas-hanlfTOTTl : lBendll99nl : lMonaghanlfT99l 
method has achieved great success at incorporating such 
processes into the framework of a particle method. In 
SPH, the gas is also represented using particles (which 
makes the method so useful in combination with A'^-body 
methods), which are assumed to sample the local hydro- 
dynamic quantities of the underlying flow. In addition to 
a position and velocity, these particles also possess an in- 
ternal energy intrinsic to each and a volume (or surface) 
density that is reconstructed for each particle, based on 
the positions of nearby 'neighbor' particles. Thus, these 
gas particles feel not only the gravitational forces that 
all particles in the simulation do, but also pressure forces 
and other gas dynamical effects. 

Modeling dynamical and hydrodynamical systems us- 
ing particles relies on the sufficiently accurate computa- 
tion of both the gravitational and hydrodynamical forces 
of the particles on each other, then advancing them for- 
ward in time according to those forces. Thus a time 
integration of the particles' equations of motion and ad- 
ditional equations for hydrodynamic quantities such as 
internal energy, is the problem to be solved. Constraints 
on the time integration arc that we would like it to faith- 
fully reproduce the evolution of the real system that the 
simulation is supposed to model, and that it do so ef- 
ficiently, so that results may be obtained quickly and 
insight into the physical world gained at a minimum of 
cost. 

In this paper and a companion 

(|Nelson. Wetzstein. fc NaabI l2008l hereafter Vine2), 
we describe a numerical code for efficiently simulating 
the evolution of astrophysical systems using A^-body 
particles, with the optional additions of including gas 
dynamical effects using the SPH method, self gravity 
and additional massive 'star' particles which may 
accrete the other species of particles. We call this code 
VINE. The present paper describes the physics we 
have implemented, the high level code design and the 
results of simulations using the code on a number of test 
pr oblems a s well as a comparison to the Gadget-2 code 
of ISpringeil (|2005[ ). Vine2 describes the low level design 
and optimization of the most computationally expensive 
parts of the code, the methods used to parallelize it and 
the performance of each part in serial and in parallel. 
VINE has been succesfuUy used for a large series of sim- 



ulations of galaxy interactions (iNaab fc BurkertI 120031: 
Jesseit et all 120051: iBurkert fc NaabI l2005t iDasvra et all 
2006allbl: INaab et all l2006aUbl: INaab fc Truiillol l200a 



2006allbl: iNaab et all l2006allb l: iNaab fc Truiillol 
Bell e t al..l2006l : lThomas et al.l f2p07: Burk ert et all 
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Naab et all l2007l : iJcsseit ct al. 2007; Wetzstein et al l 
2007), turbulence studies (Kitsionas et al. 2008). as well 



as planet formation (e.g. iNelsonI l2006f ). In addition, 
VINE has been extende d with an impleme ntation of 
ionizing radiation in SPH iGritschneder et all (2009a,b). 

In section|5|we describe the implementation of two sec- 
ond order integration methods included with the code, 
and a discussion of the criteria used to determine the 
time steps used to evolve the particles forward in time. 
We describe in sections |3| and SI respectively, the form of 
the equations used to implement the SPH method and 
the different options for the calculation of gravitational 
forces. In section [SJ we describe the implementation of 
'star' particles which can accrete the A^-body and SPH 
particle species. The boundary conditions available in 
the code are discussed in section |6l In section |7| we 
demonstrate some of the capabilities of the code on sev- 
eral test simulations, including both SPH as well as a 
pure A^-body problems. In section |8| the performance 
of VINE is co mpared to that of the Gadget-2 code of 
'Springe]! (gOOl). Finally, in section |9l we summarize the 
features of our code and give web sites where the source 
code may be obtained electronically. 

2. TIME INTEGRATION 

In order to simulate the evolution of a physical system 
using a set of A^p particles, we require first a set of equa- 
tions by which the system evolves and second a method 
for integrating those equations forward in time. In the 
case of particle systems involving only gravitational in- 
teractions ('A^-body' simulations), the equations form a 
set of coupled first order differential equations governing 
the motion of those particles in response to each other 
and (if present) to outside influences. This set consists 
of the equations describing the motion of each particle, 

cbcj _ 

dt ~ 

and the equations for momentum conservation 
dvi _ 
dt rrii 



(1) 



(2) 



The solution of these equations is difficult because each 
of the ANp or QNp equations (in 2 or 3 dimensions) is 
coupled through the gravitational potential, $. Other 
coupling terms must be added in cases where other phys- 
ical phenomena such as hydrodynamics (section [3]) are 
active, and may require additional equations be solved. 
For the case of hydrodynamic systems, we must solve 
not only a momentum equation appropriately general- 
ized from above, but also mass and energy conservation 
equations. 

A variety of methods for integrating differential 
equations are available (s e e e.g. the textbooks of 
iHocknev fc Eastwood [l98l lFletcheilll997f ) . with vary- 
ing degrees of utility for any given problem. Determin- 
ing which method is most efficient is highly dependent 
on the characteristics of the problem itself. Most astro- 
physical systems, for example, develop highly non-linear 
flow patterns with the practical consequence that high 
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order integration methods (i.e. those with a mathemati- 
cal truncation error proportional to a high power of the 
integration step size) are not generally useful. The non- 
linearities mean that time steps must be restricted to 
very small sizes in order to resolve the flow, while the 
high order integration requires many derivative calcula- 
tions per timestep, yielding a very high computational 
cost to evolve a system for a given amount of time. Even 
among integrators of identical 'order', characteristics of 
the errors that develop can vary, with one being inap- 
propriate for use on a problem another might be ideally 
suited to solve. 

In order to allow the user enough flexibility to de- 
termine what is best for a specific problem, we have 
implemented both a second order Runge-Kutta scheme 
(fFehlberg 1968) and a s econd order leapfrog s cheme 
fsee e.g.lHockneY fc Hohl|[r9 69: Hcrnquist & Katj |1989l: 
iRasio fc Shapirolll99lHSpri^gel et al..,200Il in VINE. Al- 
though very different in structure (leapfrog requires only 
one force computation per time step for example, while 
the Runge-Kutta scheme requires two), users may trans- 
parently select one or the other at the time the code is 
compiled. Both integrators are very modular in the sense 
that they use the same bookkeeping scheme for particles 
and their timesteps, and identical calls to update rou- 
tines. If a user finds that still a different choice of inte- 
grator is required, we expect that it would be straight- 
forward to add it as an alternative as well. 

2.1. The Runge-Kutta-Fehlberg (RKF) Integrator 

Runge-Kutta schemes of a variety of forms have been 
developed since the original publ ication of th e general 
method more than 100 years ago (|Kuttalll90lD . but un- 
til the work of .Fehlberg (1968) they included no formal 
description of the size of errors that developed during 
an integration. Fehlberg realized that a Runge-Kutta 
scheme of a given truncation order could be embedded 
in a similar scheme of one order higher, given a suit- 
able choice of coefficients for both. The resulting pair 
of methods, used together, could be used to determine a 
limit on the size of the next order truncation error for the 
lower order scheme in the pair. In VINE, we have im- 
plemented the first order scheme with second order error 
control ('RKF1(2)'), as described by Fehlberg (1968). A 
brief description of the scheme is summarized here. 

For any quantity, g, to be integrated in time, the quan- 
tity at the new time t"+^ is computed from its value 
g" at the previous time t" utilizing the discretization: 

g"+l =g« + (cofco + Clfci)Ar (3) 

where the Ci are constant parameters and the ki are time 
derivatives of g evaluated at various points during the 
timestep: 

fco = g(i",g"), (4) 

/ci = g(i" + aiAr,g"+/3iofcoAr) (5) 
fc2 = g(r + aaAt", g" + /JaofcoAt" + /Saifci Af"). (6) 

where g is the time derivative of g and At" is the time 
step from t" to t"+^. The fc2 term does not appear di- 
rectly in the integration equation [3] above, but does ap- 
pear in the error criterion defined in section [2. 3. 21 below. 
The coefficients ak and j3ki and Ck defined by Fehlberg 
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are reproduced in table 12.11 By definition of the coeffi- 
cients, the /c2 term is identical to the feg term for the fol- 
lowing timestep, reducing the number of derivative evalu- 
ations to two per timestep. The Ck define the coefficients 
used by the first order RK scheme, while the Ck terms 
define the coefficients used in the second order scheme 
used only indirectly to define the truncation error. 

2.2. The Leapfrog Integrator 

The leapfrog (LF) integration scheme is formally an 
offset integrator: positions and velocities are offset 
from each other in time b y half a time step (see e.g. 
iHocknev fc EastwoodlllQSH ). Alternate updates of posi- 
tion and velocity advance from one half step behind to 
one half step ahead of the other update in the sequence, 
effectively 'leapfrogging' over each other in the integra- 
tion scheme, which takes its name by analogy from the 
children's game. Th e leapfrog impl e menta tion in VINE 
is similar to that of ISpringel et al.l (|2001f l. for which a 
mathematically equivalent form is used, in which the 
equations for the positions and velocities are written in 
a non-offset form as 

v"+i=v" + a"+i/2At" (7) 
x"+i = x" + i (v" + v"+^) At" (8) 

where again indices n, n-|-l/2 and n-\-\ denote quantities 
at time t", t"+i/2 and t"+^, respectively, and At" is the 
step from n to n -f 1. To recover the offset form, notice 
that positions and accelerations are actually defined on 
half timesteps, but that the position update is effectively 
split into two halves. With a fixed increment At, each 
position update as defined in equation [8] uses the veloc- 
ity corresponding to two separate velocity updates, half 
from timestep n, v"/2, and half from timestep ri + 1, 
v"+^/2, so that, effectively, updates of position are only 
half completed at any 'full' time step n. 

The velocity update requires that accelerations be cal- 
culated on half steps, n + 1/2. For simulations involving 
self gravity and hydrodynamics, the accelerations depend 
on particle positions, so that a separate, temporary up- 
date of the position to its correctly offset temporal loca- 
tion is required. This update takes the form 

x"+i/2 ^ x" + iv"At" (9) 

as expected from the discussion above. Other quantities 
requiring integration, such as internal energy, smoothing 
lengths or viscous coefficients needed for hydrodynamic 
simulations (section [3|), are defined on integer time steps. 
Their derivatives must therefore be calculated on half 
time steps at the same time as the accelerations them- 
selves are calculated. Complications arise because for 
most such variables, the derivative is a function of the 
variable itself or of others defined on integer time steps. 
Two simple examples are of artificial viscosity or of PdV 
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work, each of which require velocity. VINE employs a 
linear extrapolation of each quantity from n to n + 1/2, 
as shown in equation [5] for position, so that the inte- 
gration scheme itself remains formally second order. In 
summary, the algorithm can be written as 

1. complete position update to x"+^/^, extrapolate 
other quantities as required. 

2. compute a"+^/^ and other derivatives. 

3. update velocities v" v"+^, using equation [71 
Update other relevant quantities using appropriate 
analogous update equations. 

4. update positions x" x"+^, using equation [S] 

After the fourth step the sequence starts anew. This 
variation is called a 'Drift-Kick-Drift' (DKD) leapfrog, 
because particles drift forward for half of a time step, 
undergo a force (the 'kick') calculation, then drift for- 
ward for an additional half step under the control of 
the newly determined force. An alternative ('Kick-Drift- 
Kick', or KDK) can be formulated as well, with posi- 
tions and velocities instead being defined at integer and 
half integer times. This variant has been shown to have 
somewhat bett er error properties than DKD leapfrog 
(|Springelll2005[ ). but has not yet been incorporated into 
VINE. 

Although slightly more cumbersome than the pure 
offset form, either of the leapfrog variants above are 
advantageous because adjustable time steps, such that 
Atn ^ At„-|_i, are straightforward to implement, as are 
individual time steps for different particles (see section 
12. 4p . Both features will be desirable in simulations of sys- 
tems where time scales vary widely as conditions change 
over time. The consequences for such adaptability is that 
the exact leapfrog symmetry between position and veloc- 
ity updates is lost, but changes between one time step 
value and another should be infrequent enough in prac- 
tice to make overall errors resulting from them small. 

2.3. Timestep Criteria 

In order to produce an accurate integration, time steps 
must be chosen that are small enough to maintain the 
stability of the system against the growth of errors. At 
the same time, time steps should not be much smaller 
than required to maintain stability and accuracy, be- 
cause that wastes computational resources that could be 
more efficiently employed in performing larger simula- 
tions. Here we describe the criteria used in VINE to 
determine timesteps for the particles. 

2.3.1. Time Step Criteria common to both the Leapfrog and 
RKF Integrators 

The timestep criteria described in this section apply 
to both the leapfrog and the RKF integrators. The next 
time step At"+^ of a particle i is determined by the min- 
imum of value derived from a set of N criteria: 

Ar+i =min(Ar^+i), (10) 

where we have suppressed the subscript, i, designating 
each particle. Whether or not to include a particular 
criterion may be selected by the user at compile time 



by commenting out (or not) calls to subroutines that 
calculate one or another of the At^^, and by routines 
active only when certain options are selected, such as the 
Runge-Kutta integrator or SPH (sections 12.3.21 and 13 .4p . 

Three simple criteria are based on changes in the ac- 
celeration of a particle: 




its velocity: 

AC'=r.eZ^, (12) 
or both in combination: 

^^va ~ '''va-, fj (13) 

|a| 

where h, a and v are the gravitational softening length, 
the acceleration and velocity of the particle i at the pre- 
vious time step, respectively, and the three values of r 
are tuning parameters for each criterion. Numerical ex- 
periments show that Tacc ~ 0.5 gives good results. When 
included, we use similar values for the other two toler- 
ance parameters as well. 

Although the combination of all three c riteria i s some- 
times useful, and indeed is used in e.g. [Nelson! |20Q6|) 
with VINE, in many cases it is sufficient to include only 
the acceleration based criterion of equation [Til allow- 
ing the others to be neglected. For example, when the 
velocity criteria are included they can impose very re- 
strictive constraints on the calculated time step. If a 
particle moves at very high velocities, equation [121 can 
require small time steps even when the particle does not 
change its trajectory and could otherwise be integrated 
with large time steps. Similarly, equation [T31 can limit 
the time step of a particle when it moves very slowly but 
feels only small forces. 

An additional time step criterion is also present in 
VINE, but is never explicitly calculated by the code ei- 
ther on a particle by particle basis or for all particles in 
aggregate. Instead, it is set by the user when a specific 
simulation's initial conditions are specified. Namely, the 
user must specify the maximum time step, Atmaxi per- 
mitted by the code, on which the hierarchy of time step 
bins is built. This value effectively serves as an additonal 
time step criterion because it limits the steps of parti- 
cles which might obtain too large time steps through the 
other criteria. Similar to the parameter values used for 
the integration accuracy in the time step criteria, its op- 
timal value chosen will be problem dependent. In VINE, 
it also serves an important secondary role, because VINE 
outputs checkpoint dumps for the simulation only on in- 
teger multiples of the maximum timestep. Therefore, for 
an analysis which requires high time resolution one might 
choose a different value than for another analysis which 
takes only the final state of the system as input. 

2.3.2. Time Step Criteria for the RKF Integrator 

In addition to the conditions above, the RKF inte- 
grator requires an additional criterion, which limits the 
second order truncation error in the discretization (see 
section 12.11) . As defined by Fehlberg, the second order 
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truncation error for integrating variable q through a time 
A<(*) win be: 

TE = C2{kn-k2)Ar, (14) 

where fco and fc2 are defined in equations[3]and[5]and £2 is 
defined in table 12.11 Unfortunately, the truncation error 
as defined is an absolute error. It depends on the units 
for a given variable as well as the size of the system and is 
therefore not particularly useful without explicit tuning 
for every variable, physical system and simulation. Vari- 
ous relative error metrics are straightforward to develop 
from equation [14] however. For example, we may define 
a relative error metric such that the truncation error is 
no larger than a small fraction of the magnitude of the 
variable itself: 

RE^^^, (15) 

TRKF|g| 

where we define trkf as a tunable parameter restricting 
the error, and we require RE to be < 1 for an acceptable 
error. 

We may expect the optimal timestep to be proportional 
to the square root of the truncation error, since the error 
itself is second order in time. Then we may determine 
a new time step from the old by comparing the ratio of 
the new and old values to the error metric: 

At-^^ = Atl^^./l/RE. (16) 

With this definition, the n + 1 time step will be decreased 
when the error for a given time step is large. If small, it 



will be increased. We define the final value of At 



Tl+l 

RKF 



to 



be the minimum over all integration variables defined in 
the simulation. 

Although often an improvement over the direct mea- 
sure of error and time step definition, equation [16] may 
still suffer from several deficiencies in practice, depending 
on the specific integration variable. For example, the size 
of the timestep calculated for positions depends on the 
position itself, and particles near the origin will necessar- 
ily receive more restrictive time steps than those further 
away. An arbitrary change of coordinate system, shift- 
ing the entire system some distance in any direction, will 
also change the error metric and time step calculation. 
For the same reasons, velocity errors and timesteps will 
suffer similar problems. 

A variety of strategies to sidestep undesirable proper- 
ties for one variable or another are available, including 
replacing q in equation [15] with its value subtracted from 
the system's average velocity or center of mass for ve- 
locity or position coordinates respectively, or adding a 
constant error softening value to eliminate singularities 
in the error near ze r os of the variable. Following discus- 
sion m iPress et all (|1992D . one may also replace \q\ for 
some variables with its value added to its change at the 
last timestep: 

q' ^\q\ + \Aeq\. (17) 

Alternate error metrics such as these have been imple- 
mented in VINE with some success on several systems we 
have studied, however in general, we expect that suitable 
error metrics will need to be worked out on a case by case 
basis by the user. Some small comfort may be had in the 
fact that, under most conditions, other error conditions 
are more restrictive than the RKF error, making ques- 
tions of the suitability of the form of the RKF criterion 
moot. 



2.3.3. Differences in settings for time step criteria when 
global or individual time steps are used, or for 
different problems 

For both of VINE's integrators, it is possible to check 
after each time step whether the integration over that 
time step met or failed the set of error criteria described 
above. If so, in principle one can revert the time step 
and repeat it with a smaller step size. In practice, re- 
version is only possible if the entire system is advanced 
using a single, global time step for all particles, and is in 
fact done in VINE when the global time step option is 
selected. When individual time steps are used (discussed 
in section [2^ below), reverting the time step is usually 
not possible because it requires keeping track of a large 
set of previous time steps for every particle in the system, 
and is therefore usually prohibitive in terms of memory 
as well as computational effort. Thus for an individual 
time step scheme, the criteria for choosing the next time 
step and their settings must be chosen more conserva- 
tively than with the global time step option, in order to 
ensure in advance that the time step is small enough to 
inte grate that particl e's properties correctly. For exam- 
ple, |Batii^F^lJ(|l995t) demonstrated that criteria similar 
to those in equation [T6l give good results with the RKF 
integrator when used alone with global timesteps, but 
that errors become unacceptable when used alone with 
individual particle time steps. Adding another criterion 
of the form of equation [Tl] alleviated the problem. Simi- 
lar situations may arise with the criteria currently imple- 
mented in VINE when used on different problems. We 
have therefore designed the error criteria code as a set of 
independent routines for calculating specific error crite- 
ria, each with the same interface and each called from a 
master routine, whose sole purpose is to serve as a loca- 
tion at which criteria may be included or excluded. The 
selection of which criteria to use and the addition of other 
criteria can be done by the user with minimal difficulty, 
and the timestep itself is computed from the minimum 
of all active criteria. 

2.4. The Individual Timestep Scheme 

In many astrophysical contexts, it is necessary to 
model the evolution of regions with densities (or other 
quantities) that are orders of magnitude larger than those 
of other regions in the same simulation, or that change 
orders of magnitude more quickly. Such large variations 
naturally introduce a wide range of physical timescales. 

Although it might be desirable in some cases to evolve 
all particles with a single time step in order to maintain 
a highly stable integration, the computational expense 
of doing so in all cases can be prohibitive. Instead, it is 
possible to assign time steps for each particle on an indi- 
vidual basis as required by a given particle and thereby to 
evolve the particles independently. Assigning individual 
time steps to the particles can speed up a simulation con- 
siderably, since little processor time is wasted on evolving 
less dynamic regions with the same small time steps as 
the most dynamic regions present in the system. 

Users of VINE can select whether to run simulations 
with either a single global timestep or an individual time 
step for each particle. Global time steps can be set ei- 
ther to the minimum absolute time step ('global continu- 
ous' mode) or to the minimum binned time step ('global 
binned' mode) corresponding to the time bins that would 
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calculation, defined by the condition that 
/next = min(/o + Idt, h) 



(18) 



Fig. 1. — The individual time stop scheme used in VINE. Only 
the five highest levels of the time step hierarchy are shown here. 
Level represents the maximum time step, level 4 is a factor of 2^ 
smaller. 



be used if the individual time step option were active, as 
described below. The latter option restricts the time 
steps to a discrete set of sizes and therefore may enhance 
integrator stability, particularly for the leapfrog integra- 
tor, in which fixed time steps are formally required to 
retain the symplectic character of the integration. 

When the user selects individual tim e steps, VINE uses 
a vari ant of the scheme proposed by iHernguist fc Kata 
(119891) fo r their Tre eSPH code, b ased on previous work 
of iPorted (pSl and [EwIil(fl98ah . The user first chooses 
a maximum time step allowed for the system, lS.tmax, 
which forms the top of the hierarchy of smaller time step 
levels constructed by dividing the time step on the next 
higher level by 2 (see figure [1]). For time increments 
smaller than the maximum, time is regarded as an inte- 
ger quantity whose maximum value is represented by a 
large power of 2, in VINE set to 2^^. Each shorter time 
step can then be represented by a value 2^*~" where n 
is the time step level counted from the top (see figure 
[T]). Multiplying each integer by the smallest real val- 
ued time step increment recovers the true, real valued 
time or time increment, At, as needed. Time updates 
smaller than the maximum are updated in integer in- 
crements, thereby avoiding errors associated with time 
drift or finite precision truncation errors which may oc- 
cur when many small, real valued quantities are added 
together. After the integration has proceeded through a 
full IStrnax stcp, the real valued absolute time is incre- 
mented by /S.t„iax- 

VINE assigns each particle its own provisional time 
step as described in 12.31 then truncates it to the next 
smaller time step level defined in the scheme. Time step 
assignment is done transparently for both the leapfrog 
and RKF integrators using the same binning scheme. 
Because the number of levels is finite, particles evolve 
forward in groups corresponding to one or more levels 
of the scheme that are currently active, rather than one 
by one on a continuous spectrum of time steps. This is 
important for achieving computational efficiency, both in 
serial and parallel operation, because force computations 
for only one or a few particles at a time are comparatively 
inefficient and difficult to balance among a large array 
of processors running in parallel. Overheads associated 
with repeatedly extrapolating all inactive particles are 
also minimized by the grouping. 

Three integer time variables are assigned to each par- 
ticle, defining the beginning of its current time step, /q, 
its half time step, /i, and its time increment, Idt- At 
the beginning of every update, VINE performs a search 
through the list of all particles to determine the closest 
future time for which any particle requires a derivative 



where the minimum is taken over all particles, i, and the 
resulting value /next defines the time at which the next 
derivative calculation will occur. Using the value of /next, 
VINE sorts particles onto three lists: particles for which 
/next matches their end time step time, /q -I- Idt, those 
for which /next matches their half time step time /i , and 
those for which neither criterion applies. The three lists 
correspond to particles requiring 'full' updates to the end 
of their time step, those requiring 'half updates at its 
midpoint, and those which do not require any update at 
all. Simultaneously, VINE calculates a real valued time 
step increment, Ar, for use during the integration for 
each particle. For particles at either their half or full 
update step. At is identical to the particle's integration 
step, Ai, otherwise, it defines the difference between the 
current time and the time of its last update. The val- 
ues of Ar are then given to the integrator, all particles 
are extrapolated to the current time, a derivative calcu- 
lation is performed and the system is updated. Upon 
completion of the update, the values of /i for all parti- 
cles reaching their half step time are temporarily reset 
to large values, in order to permit discovery of their end 
time step time via equation 1181 

The sequence of updates for different time step levels 
proceeds in different order for the leapfrog and RKF inte- 
grators. For the RKF integrator, particles on more than 
one time step level can be active at the same time because 
a half update time at one level will correspond exactly to 
a full update time at all lower levels. Time therefore pro- 
ceeds monotonically forward through the hierarchy. As 
particles on finer levels complete their time steps, parti- 
cles on coarser levels join in and are integrated forward 
as well. In contrast, because the leapfrog is an offset 
'DKD' integrator, particle updates can occur for only a 
single time step level at a time because half time steps 
on one level do not correspond to the times of half steps 
on any other level. Also, when particles on coarser lev- 
els become active, they advance forward by an increment 
larger than a full time step on a fine level. This is impor- 
tant because it means that inactive particles on coarse 
levels must be extrapolated both forward and backwards 
in time at different points in the sequence as the system 
is synchronized in preparation for the next derivative cal- 
culation. No particle is ever extrapolated more than one 
half step in either direction however, so the extrapola- 
tion remains within the range spanned by the original 
update. 

3. SMOOTHED PARTICLE HYDRODYNAMICS 

The Smoothed Particle Hydrodynamics (SPH) tech- 
nique for simu lating hydrod ynami c phenomena was first 
described by iLucvl ()1977[ ) and iGingold fc Monaghanl 
yjTZ). Since then, i nuch effort has gon e into 
developing the method ( Ginaold & Monaghan 19821 : 
iMonaghan fc Gingoldl ll985tlMonaghan 1985., .198^ ) and 
it has become widely used in the stud y of many astro- 
physical problems fsee lBend[l988lll990tlM"onaghanlll992l 
for reviews). 

There are a variety of formulations of the SPH equa- 
tions, different in various details. In the following, we 
present the formulation implemented in VINE, and will 
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discuss differences from other formulations and some of 
the corresponding advantages and disadvantages sepa- 
rately in section [231 

SPH solves the hydrodynamical equations in La- 
grangian form and can be regarded as an interpola- 
tion technique: the positions of the SPH particles com- 
bined with an interpolation kernel define the fluid quan- 
tities throughout the flow. By default, VINE imple- 
ments the widely used W4 sm oothing kernel defined by 
iMonaghan fc Lattanziol (|1985D as: 



1 



3„2 , 3 
3 



if < t; < 1. 







ifl<w<2, (19) 
otherwise, 



where i/ is the number of dimensions and a is the nor- 
malization with values of 2/3, 10/(77r) and I/tt in one, 
two and three dimensions respectively. Fluid quantities 
at the position of particle i are then obtained as weighted 
sums over the properties of all its neighboring particles. 
For the density, this reads 



N 



(20) 



where W is the kernel function defined by equation [12] 
and nij are the masses of the N neighboring particles. 

The dimensionless separation, v = Vij/hij, between 
particles i and j, requires the actual magnitude of the 
separation, r^j = jr^ 



and their characteristic 



'smoothing' length scale, hij, defined as the mean of the 
smoothing lengths of the two particles: 



hij = {h^ + hj)/2. 



(21) 



Thus, with these definitions and the W4 kernel, particles 
whose separations are v < 2 contribute to the summa- 
tions as 'neighbors'. Also, the influence of particles on 
each other is symmetric under interchange of indices, an 
important characteristic required to ensure conservation 
of momentum and other quantities. All other quantities 
requiring symmetrization are defined similarly in VINE 
(see also section [531 below). 

The choice of kernel used in VINE is made via an ex- 
ternal module, so users may substitute an alternate if 
desired with little or no change to the rest of the code. 
The W4 kernel above is second order in the interpolant 
and has the advantage of being defined on compact sup- 
port: particles more distant than w = 2 do not contribute 
to the sum. As written (see Vine2, section 3.5), VINE 
currently defines neighbors as particles with w < 2, con- 
sistent with the definition appropriate for the W4 kernel. 
Other choices of kernel might include Gaussian kernels 
used more commonly earlier in the history of SPH, which 
are not defined on compact support at all, or other spline 
kernels, perhaps compactly defined over a different range 
of separations, v. For these kernels, a new neighbor def- 
inition must be made in the code, to define an artificial 
cutoff or new separation range for the neighbor search. 
This modification requires a change of only a single line 
of code, and will therefore be trivial to implement. 

3.1. The SPH formulation of the equations of 
hydrodynamics implemented in VINE 

3.1.1. Additions to the Momentum Equation 



When gas dynamics are included in a simulation, an 
additional term must be included to equation [21 which 
models forces due to pressure gradients and in its ideal 
form is: 

dvi Vp 

— - (22) 
P 



dt 



where p and p are the pressure and mass density of the 
fluid, respectively. This form defines the momentum 
equation appropriate for 'ideal' fluid dynamics. Explicit 
terms to model viscous processes following the Navier- 
Stokes formalism are not included. The SPH formulation 
used in VINE casts equation [22l in the form: 




ViW^(rij,/iy) . 



(23) 



where Vi means take the gradien t with resp ect to the 
coordinates of particle i (see e.g. iBenzl [l990() , p is the 
gas pressure and p, the density, is given by equation [201 
An additional term, Hy , appears in equation 1231 but has 
no counterpart in equation 1221 Its purpose is described 
in section r3. 1.31 and takes the form of an artificial viscous 
pressure included to model dissipative effects, without 
which there is no mechanism to convert kinetic energy 
into heat due to non-reversible processes such as shocks 
or viscosity. Alternatively (or in addition), one might 
include a dissipative term modeled on the viscous terms 
found in the Navier-Stokes equations directly, however, 
in its present form VINE does not include such a term. 

3.1.2. The energy equation and equation of state 

The change in the thermodynamic state of the 
fluid requires an evolution equation for a state vari- 
able corresponding to its internal energy or entropy. 
ISpringel fc Hernguistl (|2002D compare the benefits of us- 
ing one or the other formulation. VINE employs an equa- 
tion for the specific internal energy of the gas, defined for 
each particle. In the simplest case of an ideal gas with 
no external heating or cooling terms, only compressional 
heating and cooling are important and the equation takes 
the form: 

^ = -^V.., (24) 
dt p 

where v is the local fluid velocity. Analogously to the 
momentum equation above, the SPH formulation used 
in VI NE casts this equation with two terms as (jBenj 
[1993): 

'^'"^ V,M^(ry ,/iy ) 

+ 2 H™j n*jv„ • VjW^(r,j,/i,j). (25) 
i=i 

where the first term corresponds to reversible {'PdV') 
work. As for the momentum equation, the second term 
including the viscous pressure Hij has no counterpart in 
equation 1241 It models the irreversible thermal energy 
generation produced by the artificial viscosity as it at- 
tempts to model shocks or turbulent energy dissipation. 
A master routine is available for users of VINE to in- 
corporate additional sources or sinks of internal energy 
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via calls to external routines, written to model a specific 
system. Switches to activate calls to these routines are 
available to users in a text input file, or can be trivially 
added to it, which is read in by VINE at run time. 

To close the set of equations, an equation of state must 
also be defined to relate the internal energy, density and 
pressure to each other. VINE includes options to call sev- 
eral simple equations of state, such as isothermal, isen- 
tropic and ideal gases. While the simple choices included 
will suffice for many problems, we recognize that any 
set of equations of state will not be sufficient in general. 
Therefore, as for energy sources and sinks, a master rou- 
tine is available for users to incorporate additional equa- 
tions of state into VINE with minimal difficulty through 
calls to external routines, activated by switches set in a 
text input file. 

3.1.3. Artificial Viscosity 

VINE incorporates an artificial viscous pressure in its 
momentum and energy equations to model irreversible 
thermodynamic dissipation from shocks and viscosity. 
Following standard practice, the viscous pressure in- 
cludes both a bulk viscosity (the 'a' term) to elimi- 
nate subsonic velocity oscillations and a von Neumann- 
Richtmyer viscosity (the '/?' term) to convert kinetic en- 
ergy to thermal energy and prevent particle i nterpenetra- 
tion in shocks (jMonag han fc GingoTdHTosl . Formally, 
the expression for 11^ is 



n, 



<o, 

> 0, 



(26) 



where scalar quantities with both indices i and j are sym- 
metrized values using equation [34] in section 13.31 below. 
Vector quantities with two indices represent differences, 
e.g. Vy = Vj — Vj, and /i^ plays the role of the velocity 
divergence: 



My 



I I] 



(27) 



with 77 10~^ — 10~^ to prevent singularitie s. 

In its original form ('Lattanzi o et al.l 1X9861) . fij is set 
to unity in equation [?71 Balsara ( 199Cll . ll995f) found that 
this form gives rise to large entropy generation in pure 
shear flows, which he suppressed by defining an addi- 
tional factor / to reduce the contribution selectively in 
such flow conflgurations. He defines /,; as 







•v.)| 




l(V-v.) 


-1- 


(V X V,) 


+ 77' 



(28) 



with rj' preventing divergence once more. VINE allows 
the user to choose whether or not to allow the factor / 
to vary or to set it permanently to unity for all particles 
via a run time option. 

The factors a and (3 in equation [21] are parameters 
controlling the strength of the artificial viscosity. The 
best choice for their values depend somewhat on the 
pro blem being addressed in a particular simulation (see 
e.g. lSteinmetz fc Miilleilll993l : iLombardi et al.lll999L for 
some common choices) , but values near a = \ and f3 = 2 
are most commonly used in the literature. VINE al- 
lows users to set both values in an input file or, at the 
user's option, to be set dynamically as conditions war- 
rant at different times or locations in a simulation. The 



latter sett ing utilizes a varian t of th e formulation de- 
signed bv iMorris fc Monaghanl (|1997l ). where each par- 
ticle is assigned time dependent viscous parameters, to 
minimize unwanted viscous dissipation in quiescent fiows 
while retaining good reproduction of the flow properties 
in shocked regions where it is required. In both standard 
test problems and in realist i c cosm ological models of of 
galaxy clusters. [Polag et al.l (|2005l ) found such a time de- 
pendent AV formulation to be very useful, yielding much 
more accurate results compared to the standard formu- 
lation with fixed parameters. 

Following iMorris fc Monaghanl Il997l each particle is 
assigned its own value of the viscous coefficient, a,, which 
changes in time according to a source and decay equation 
taking the form: 



dai 
~dt 



oii - a* 



St. 



(29) 



The first term forces the value of to decay asymptot 
ically to a value of a* on a time scale r^, given by 



c,V(7-l)/27' 



(30) 



where 6a is a factor to relate the decay time scale to 
a more convenient decay length scale describing the 
distance over which the coefficient decays behind the 
shock. The form of th e decay time is derived by 
IMorris fc Monaghanl Il997l from the post shock Mach 
number for strong shocks, combined with the sound 
speed and smoothing length. 

The second ter m in equation [29] is a so urce term of the 
form discussed bv lRosswog et al.l ()2000[ ): 



Si = S'omax(-/j(Q!n 



a»)V-v„0) 



(31) 



where amax is a maximum value of the viscous coefficient 
chosen to ensure that repeated strong compressions or 
shocks do not increase the viscous parameter to values 
much larger than those known to yield good results in 
test problems. The source function depends on the ve- 
locity divergence, so that grows in strong compressions 
as required, and also includ es the Balsara coefficient fro m 
equation [28] as suggested bv lMorris fc Monaghanl[l997l in 
order to suppress growth of the source function in shear 
flows. The scale factor Sq accounts for equations of state 
with 7 ^ 5/3: 



5/3 -hi-,,, ,7 + 1. 
-5^0 = [H-rTZ — 



(32) 



^5/3-1'"^ W-1' 

and is employed to ensure that the peak value of a re- 
mains more nearly independent of the equation of state. 

As implemented in VINE, each of the parameters a, 
/3, ttmax, a* and 5a may be set at run time by the user 
with appropriate choice made in a text input file. Both 
the fixed and time dependent viscosity implementations 
retain the von-Neumann Richtmyer term for each parti- 
cle, with the ratio of the two terms [3 /a also set by the 
user and fixed to the same value for all particles. 

3.2. Variable smoothing lengths 

With few exceptions, astrophysical systems exhibit 
moderate or large density contrasts and ideally one would 
like to resolve these contrasts as well as possible in sim- 
ulations. The accuracy of the interpolation of the SPH 
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scheme depends on the number of neighboring particles 
taken into account in sums like equation[2Ql which makes 
high numbers of neighbors (large h values) desirable. 
On the other hand, large neighbor counts increase the 
computational effort and decrease the spatial resolution, 
since only scales larger than h are resolved. These two 
competing effects lead to a requirement that the num- 
ber of neighbors should stay roughly constant at a level 
where correct interpolation is assured without wasting 
computational resources. Thus the particle smoothing 
lengths, /i, need to vary in space and time, i.e. each 
particle i gets its own time-dependent smoothing length, 
hi{t), which is then integrated according to: 



dhi 1 hi dpi 1 
dt Ud Pi dt rid 



(33) 



where Ud is the number of dimensions and the continuity 
equation has been used to replace the time derivative of 
the density (|Bend |1990[ ) . If the integration of equation 
[33l leads to a neighbor count outside a given range, an 
exponential correction term pushes \hi\ to greater values, 
such that more/less neighbors are found on the next step 
if too few/many were found on the current step. VINE 
attempts to keep the neighbor count within the range 
[30, 70] in 3D and [10, 30] in 2D. These ranges have 
been deliberately set comparatively broadly because they 
depend somewhat on the total number of particles (see 
iLombardi et al.l 1 19991 and references therein) and on 
the physics itself (|Attwood et al.ll2007f) . We recommend 
that in actual use, users set somewhat more restrictive 
ranges, via parameters that are changed when the code 
is compiled, and based on discussions in these references. 

The error introduced into the interpolation scheme by 
allowi ng for varying smoothing lengths is secon d order 
in h (jMonaghanl 119851 iHernquist fc Katdll989f ). Since 
SPH with constant smoothing lengths is also only accu- 
rate to se cond order in h with the kernel in equation [19] 
(see e.g. iBenzj [l990l ). the addition of variable smooth- 
ing lengths imposes no additional restriction. However, 
variable smoothing lengths formally introduce an ad- 
ditional term in the interpolated forms of the gradi- 
ents of functions, as these forms involve gradients of 
the kernel. These additi onal terms ca n be neglected 
in many simulations (see lEvrardl Il988l ). but doing so 
may cause spurious entropy generation in some cases 



Nelson & Paoaloizou 
Price fc Monagha: 



)us entropy generation m some cases 
^ '1994 ISpringel fc Hernquls^ l2002t 
n 200|rin its present form, VINE ne- 
glects the additional terms, but modifications to the code 
to implement them via the 'variational methods' deriva- 
tions of SPH discussed below are planned for the future. 

3.3. Symmetrization of quantities in SPH 

The description of the SPH equations so far has 
only dealt with the specific form i mplemented in VINE. 
Other form ulation s exist (e.g. iGingold fc MonaghanI 
ll982tlEvrardlll988i : IHernquist fc Katzlll989[ ). but a thor- 
ough review of even the most c ommon is beyond the 
sco pe of this paper. We sugge st iThacker et al.l (|2000l ) 
and lSpringel fc Hernguistj (|2002| ) for a more detailed dis- 
cussion of the alternatives, and limit ourselves here to 
an outline of a few of the major points regarding the 
differences and why we use the scheme implemented in 
VINE. 



An important difference between SPH implementa- 
tions comes from the way force symmetry between pairs 
of particles is handled. For momentum conservation, 
symmetry between interchange of indices in the pairwise 
force calculation between two particles is required. As 
noted above, VINE uses the arithmetic mean of quanti- 
ties defined for each particle to produce the symmetry: 



(34) 



where q is some quantity intrinsic to a particle such as 
smoothing length or sound speed. Specifically for the 
calculation of the kernel and its gradient, VINE uses a 
symmetrized smoothing length, htj, so that the smooth- 
ing kernel defined for two particles is 

W,,=W{\n^Vil{h, + h,)/2). (35) 

Alternatively, some implementations derive force sym- 
metry from a symmetrized kernel. Rather than defining 
W{rij,hij), these implementations define Wij as 



1 



1, 



W,,^-Wi\v,-rilh,) + -Wi\ri-Vi\,h,) 



(36) 



as originally sugges ted bv iHeriiauist fc Katd (119891) 
and u s ed e.g. bv iDave et al.l (Il997fl: ICarraro et all 
([1991: iLia fc Garrard (I2000D : ISpringel et all (|2001h and 
iWadslev et all (|2004D . 

The choice of kernel symmetrization has direct impli- 
cations for the neighbor search. To assure force symme- 
try, neighboring particles i and j must be determined 
to be mutual neighbors. Some code s use range search 
techniques (see e.g. .Steinmetz 1996; Springel et al.l[200lL 
note that only the serial version of Gadget- 1 uses this 



technique). In this case, all particles j with 



<T]hi 



are used as neighbors of i, with rj > 1 and typically 
77 « 1.3. This method does not guarantee that both 
particles i and j will find each other as neighbors. In- 
stead, VINE uses a node opening criterion based on the 
quantity hij = {hi + hj)/2, which is defined identically 
for all particle/particle pairs, particle/node pairs and 
node/node pairs contained in the tree. It therefore as- 
sures by construction that the mutual neighbor property 
is satisfied for all particle pairs i,j. The higher com- 
putational cost (if any) required to complete searches is 
more than offset by the fact that symmetrizing the ker- 
nels through equation [36] leads to evaluating the kernel 
function and its gradient twice, while VINE only needs 
to evaluate them once. 

A second important difference in symmetrization 
comes from differences in the forms of the momentum 
and energy equations. Rather than calculate a sum of 
two quantities, one from each particle in a pair, a geo- 
metric mean might be used, or symmetry requirements 
relaxed altogether and a non-sy mmetric form of the 
equati ons of motion implemented. ISpringel fc Hernguistj 
([2OOI compare these alternatives on several test prob- 
lems, along with two other formulations recasting the en- 
ergy equation instead as an entropy equation. They find 
that of the three formulations using an energy equation, 
the best alternative is an arithmetic mean. This result is, 
however, secondary in significance to their finding that 
in cases where the system is under-resolved, either spa- 
tially or temporally, all of the energy formulations they 
studied may produce inaccurate results, as does one of 
their entropy formulations. 
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Instead, they recommend employing an entropy equa- 
tion using a symmetrization based on a novel variational 
methods derivation of the equations of motion and of 
the entropy equation itself as formulated in SPH. This 
form explicitly conserves entropy in adiabatic flows, while 
avoiding the errors they demonstrated in the energy for- 
mulations at low resolution. Roughly speaking, this for- 
mulation permits unequal weights of the terms in the 
arithmetic means to be generated on a pairwise b a sis de - 
pending on the details of the flow. iMonaghanl ()2002f ) 
also provides a variational methods based derivation of 
the equations of motion, while retaining an energy equa- 
tion formulation, but does not specifically investigate its 
effect on the outcome or accuracy of any test problems. 

When properly resolved, simulations performed 
with each of the five formulations studied by 
ISpringel fc Hernguistl ()2002| ) yield results consistent with 
each other. Therefore, improvements derived from the 
variational formalism will be of most value when resolu- 
tion is limited, extending the effective resolution of dy- 
namical features in a simulation to much lower particle 
counts. Users may properly be cautioned merely to em- 
ploy resolution in their simulations always sufficient to 
ensure accurate results. Of course, such a requirement 
can rarely be met in practice because of the highly prob- 
lem dependen t definition of 'sufficient resolution'. As 
iNelsonI (|2006l ) noted in discussions of resolution require- 
ments for disk simulations, formulations which permit 
low resolution 'failure' to be more graceful or more ac- 
curate will be preferable to those whose failure modes 
produce larger errors and inaccuracies. The fact that 
the variational formalism provides such assurances is a 
point greatly in its favor and is a planned addition to 
VINE. At present however, a variational formulation of 
SPH has not yet been incorporated. 

3.4. Additional time step criteria for SPH simulations 

Two additional time step criteria are required in SPH 
simulations. First the Courant-Friedrichs-Lewy condi- 
tion must be satisfied: 



TCFL 



Ci + 1.2 (ofi Q -I- h, maxj /iy ) 



(37) 



where ai and f3i are the artificial viscosity parameters 
and Ci is the sound speed for particle i, and /Zy is defined 
by equation [27] with the maximum taken over all neigh- 
boring particles j of particle i. The fo rm of equation 1371 
is that suggested bv IMonaghanl ()1989f ). who recommend 
values of tcfl ~ 0.3 for good results. 

Secondly, VINE requires that the smoothing length of 
each particle must not change too much in one timestep: 



Aj-n+l _ 



(38) 



where is a tuning parameter. We set w 0.1 — 0.15 in 
order to ensure that particles encountering strong shocks 
require several timesteps to pass entirely through the in- 
terface. For a gas with a ratio of specific heats, 7 = 5/3 
for example, the density enhancement across a strong 
shock will be a factor of four, corresponding to a smooth- 
ing length change of 4^/'^ « 1.6 in 3D. With a density 
jump of the severity and suddenness experienced in a 
shock, the timestep restriction of equation l38l will become 



important and the particle's timestep will decrease, al- 
lowing better resolution of the physical conditions of the 
shock interface. With Th = 0.15 for example, a particle 
will require at least ^4 — 5 timesteps to contract fully 
to the post-shock condition. 

The criterion in equation 138! is also beneficial in situa- 
tions where the spatial gradient of neighbors is large, as 
it may be for particles on the surface of an object. Small 
changes in the particle's smoothing length can then lead 
to very large changes in its neighbor count and ultimately 
to large oscillations between far too many and far too few 
neighbors, from one timestep to the next. A timestep 
restriction dependent on the smoothing length varia- 
tion restricts such particles to correspondingly smaller 
timesteps, so that such oscillations do not develop. 

Preventing or damping such oscillations is important 
both because of the comparatively high cost of comput- 
ing the evolution of such boundary particles, but also 
because they can lead to oscillations in the physical quan- 
tities such as pressure forces, enhancing oscillations fur- 
ther as particles experience large intermittent kicks. Al- 
though certainly more fundamental in a physical sense, 
in a numerical sense quantities such as density are actu- 
ally derived from numerically relevant smoothing length 
of a particle. Therefore, rather than limiting the change 
in derived quantities such as density, VINE limits the 
change to the particle's smoothing length, as a more di- 
rect throttle on unphysical behavior. 

4. GRAVITY 

The most costly calculation in nearly any particle sim- 
ulation including gravity is the computation of the gravi- 
tational forces of the particles on each other. This is due 
to its long range nature, which couples all particles in 
the system to each other. Modeling many astrophysical 
systems requires forces accurate only to 0.1 — 1% and, 
if a system is collisionless, a force error of this magnitude 
leads only to a marginal decrease of its coUisional relax- 
ation time, therefore only minim ally altering the overa ll 
evolution of the system (see e.g. iHerng uist et al.lll993f ). 
For such systems, an approximate solution will be not 
only acceptable, but much to be desired if it can be ob- 
tained much more quickly than an exact solution. For 
other models, e.g. star clusters, highly accurate forces 
must be determined for accurate evolution in spite of 
their cost; an approximate solution will be useless. VINE 
implements both approximate and exact methods for cal- 
culating mutual gravitational forces of particles on each 
other. In this section, we provide an overview of our im- 
plementation of these alternatives. We refer the reader to 
Vine2 for a lower level description of the detailed meth- 
ods. Users can choose from among the available options 
at run time an appropriate choice in a text input file. 

4.1. Exact gravitational forces obtained from direct 
summation 

The most naive approach to calculating the magni- 
tudes of interactions of particles on each other is sim- 
ply to calculate directly a sum of the terms due to every 
particle on every other particle: 



a. 



^ \r. 
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(39) 
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where and rj are the positions of particles i and j, f^j 
is the unit vector connecting them and rrij is the mass of 
the j'th particle. This calculation is extremely expensive, 
requiring 0{Np) operations for every time step. VINE 
includes two alternatives for computing the sum in equa- 
tion [35] for every particle. The sum can either be com- 
puted directly by the processor, or it can be computed 
instead by special purpose hardware, so called 'GRAPE' 
coprocessors, described in section l473l below. 

4.2. Approximate gravitational forces obtained from tree 
based sorting 

When physical models do not require or do not allow 
for exact inter-particle gravitational force calculations, 
as will be the case for so called 'collisionless' systems, 
approximate forces may become a desirable alternative 
if they speed up the required computations while still 
retaining sufficient accuracy. For gravitational force cal- 
culations, approximate forces can be obtained by aggre- 
gating the contributions of distant particles into a single 
interaction, for which all of their individual contributions 
together can be approximated as being due to a sum of 
multipole moments, perhaps truncated to some low or- 
der. 

The difficulties in this approach lie in establishing how 
big each aggregate can be before errors in the force be- 
come overwhelming, and in sorting through all of the 
possible nodes, making certain that every particle is in- 
cluded in the force calculation for every other particle 
exactly once, either as an individual or as part of an ag- 
gregate. The most common and most general purpose 
method for obtaining such approximate forces is to or- 
ganize the particles into a tree data structure, and then 
use tree nodes as proxies for groups of particles. By 
examining successive nodes in the tree, all of the par- 
ticles contained in that node can be either qualified as 
interactors, or the node can be opened and its children 
examined for acceptability instead. Fully traversing the 
tree produces a list of nodes determined to be acceptable 
for interaction with a given particle, and a list of atoms 
(single particles), for which an exact computation is re- 
quired. Using a tree to determine a list of neighbors or 
acceptable nodes reduces the overall computational effort 
to 0(iVp log A^p). 

The challenge in making the use of a tree efficient are 
first to choose an efficient method to traverse the tree 
and, second, to choose an efficient method to decide 
which tree nodes are acceptable as is, and which must 
be separated into their constituent parts to be examined 
in more detail. We will describe both the traversal strat- 
egy and the node acceptability criteria used in VINE in 
detail in Vine2. For purposes here, it will be sufficient 
to describe qualitatively the criteria used to determine 
node acceptability. 

In order to calculate an accurate gravitational force 
due to some node which defines a particle distribution, 
the error it contributes to the total force on a particle 
must be small. Mathematically, for a multipole expan- 
sion to converge at all, this condition translates to the 
physical statement that the particle on which the force 
acts must be remo te from the ma ss distribution defining 
the node (see e.g. Jac ksonlll975L ch. 4). In addition, 
if the multipole expansion is truncated rather than con- 
tinuing to infinite order, as will be desirable in an ap- 



proximate calculation in order to save time, errors corre- 
sponding to the neglected contributions from the higher 
order terms must also be small. 

VINE implements three runtime selectable options for 
determining the acceptability of a node to be used in the 
gravitational force calculation, each based on a different 
implementation of the convergence radius of a multipole 
expansion and of limits on the errors due to series trun- 
cation. Following common practice, we call each of these 
options 'Multipole Acceptance Criteria', or MACs. Each 
MAC includes a user settable parameter '6*' by which the 
accuracy in different problems may be tuned to the most 
suitable value for that problem, but the interpretation 
given to is specific to each MAC. 

The first and simplest is the 'geometric' MAC: 

4 ^^'r^t, (40) 

where hi and hj are the physical sizes of the particle and 
node, respectively and r^j is the square of their separa- 
tion. The accuracy parameter 9 takes a value between 
zero and one, and parameterizes the minimum accept- 
able distance at which a node may be used in the gravity 
calculation. Alternatively, by switching the positions of 
9 and Rcrit and setting hi — it may be interpreted in- 
stead as the tangent of the angle subtended by the node 
on the 'sky' as seen by the particle. We incorporate the 
size of the particle, hi, into the MAC in order to ensure 
that the condition is satisfied for all locations inside it, 
and to allow a generalization of the MAC to be used for 
groups of particles taken together (see Vine2 for addi- 
tional details) . Errors due to truncation of the multipole 
expansion are implicitly assumed to be small, because 
each higher order term is diminished by an additional 
power of the separation, r, appearing in the denomina- 
tor of each multipole term. 

Second, we implement the absolute error criterion of 
iSalmon fc Walrenl dTool (the 'SW MAC), who derive 
explicit limits on the errors due to series truncation. A 
node, j, is acceptable for use in the gravity calculation if 

where is a value defining the maximum absolute error 
in the acceleration that a single node may contribute 
to the sum and Qj is the quadrupole moment tensor 
for node j. When the quadrupole moment is zero, this 
criterion reduces exactly to the geometric MAC above 
with its 9 defined to be unity, defining a simple separation 
cri terion. We note here also that the original formulation 
of ISalmon fc WarrenI (|1994D includes the possibility of 
including the size of the particle, hi, directly in the term 
under the square root. Instead, we choose the form in 
equation |41] in order to make possible a single calculation 
of the MAC definition for each node prior to any tree 
traversals. 

Finally, we imp lement the MAC discussed in 
iSpringel et al.l (|2001[ ). which we refer to as the 'Gadget' 
MAC, and which uses an approximation for the trunca- 
tion error of the multipole expansion at hexadecapole or- 
der to define an error criterion. A form using an octupole 
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order error formulation can be derived, but is computa- 
tionally costly for modern computers because it contains 
a square root, and is therefore not used. For the relative 
error in the acceleration contribution of the node, com- 
pared to the total acceleration at the last time step their 
criterion is: 



7-6. > 
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(42) 



where the gravitational constant implicitly present in the 
formula is set to G = 1, add is the value of the accelera- 
tion for a particle at the last time it was calculated and 
is a dimensionless maximum relative error in the accel- 
eration to be allowed to any acceptable node. Because 
it requires a previous value of the acceleration, equation 
l42l cannot be used for the first calculation. Instead, we 
use the geometric MAC of equation [40l with its accuracy 
parameter set to Q — 0.5. Also, equation [42] may not 
ensure that particle and node do not overlap in space, 
violating the separation condition required for conver- 
gence of the multipole summation. Therefore, when the 
Gadget MAC is selected, we also require that equation 
l40l with its parameter set to 6* = 1, is satisfied. 

4.2.1. The acceleration calculation 

To compute the gravitational acceleration due to each 
entry on the list of acceptable nodes on some particle, 
i, users of VINE may choose one of two options. First, 
they may choose to sum the contributions from accept- 
able nodes and atoms using a multipole summation trun- 
cated at quadrupole order and computed on their general 
purpose 'host' processor or, alternatively, they may in- 
stead choose to compute the summations to monopole 
order using a GRAPE coprocessor, if one is available. 

For the former, VINE sums the multipole series for 
each acceptable node using the un reduced quadrupol e 
moment formulation described in iBenz et "aTl (|1990r ). 
Mathematically, the acceleration on particle i due to 
node j is: 



TrQ,- 



where the function /(r) — Gr^'^, r= rj — Ti and Qj 
is the quadrupole moment tensor for node j. No spe- 
cific gravitational softening (see section |4^ is required 
in equation 1431 because the criteria used to determine the 
list of atoms ensures that no nodes will require soften- 
ing. Atoms are handled independently and contribute 
only a monopole term, softened according to the options 
discussed below. The acceleration due to all atoms and 
nodes together are summed to obtain the gravitational 
acceleration acting on a particle, i.e. the right hand side 
of equation 

4.3. GRAPE hardware 

As noted above, VINE includes options to cal- 
culate gravitational forces using GRAPE copro- 
cessors if they are available, either in approxima- 
tion or exactly. 'GRAPE' hardware (for GrAvity 



PipelinE) to accelerate A^-body calculations has 
been developed by s everal collaborations i n Japan 
(ISugimoto et all Il990t iFukushige et al.l 119911: llto et all 



. 

19911: lOkumura et al.l 119931: iMakino et al.l Il997l: 
Kawa i et al.l 120001 : iMakino et al. 2003: F ukushige et all 
2005i . see also IMak ino &: Taiji 1998 for a review), 
and has been used both by them and by many others 
throughout the astronomical community (see, e.g., 
Naab fc Burkerd 1200 3: Athanassoula 2003; Naab et al 



2006at iMerritt fc Szelll 120061: IPortegies Zwart et al. 
20061 : iBerczik et al.ll2006D . This hardware is attractive 



to users because it is a system of specialized computer 
chips designed to perform a summation of 1/r^ force 
calculations very quickly using a highly parallelized 
pipeline architecture. These chips are combined onto a 
processor board which also hosts all other other neces- 
sary functional units, including memory, I/O controller, 
etc, and which is then attached to a host computer, 
usually a desktop workstation, which performs the rest 
of the calculations required in a simulation. 

Two types of GRAPE boards exist, differentiated by 
even or odd version numbers. GRAPE boards with odd 
numbers have a less accurate numeric format, so that the 
relative error of the t otal force on a particle is of the order 
of 2% for GRA PE-3 (lOkumura et aD fT99l and 0.3% for 
GRAPE-5 (K awai et al.ll200G[ ). For colHsionless systems, 
this imposes no problem for the time evolut ion, as the 
errors are uncorrelated (|Makino et al.lll990D . GRAPE 
systems with e ven version numbers, such as G RAPE- 2 
(|Ito et al.lll99l[ ). GRAPE-4 (Makino et al."1997) and the 
most recent, GRAPE-6 (Makino ct al. 2003) have higher 
accuracy in their internal numeric formats and were de- 
signed for simulating of coUisional systems like globular 
clusters. 

VINE supports GRAPE-3, GRAPE-5 and GRAPE- 
6 boards, the latter in both a full and reduced 
size 'MicroGRAPE' fo rm, also known as 'GRAPE-6A' 
(|Fukushige et all l2005f) . The code can use GRAPE 
boards for direct summation of forces as well as a com- 
ponent in an approximate tree based gravity calculation. 
We describe the details of this approach in Vine2 after 
the tree itself has been described in detail. 

4.4. Softening the forces and potential 

Simulations of physical systems depend on a choice to 
assume that the system is either 'collisional' or 'collision- 
less', corresponding to the statement that the evolution 
of the system as a whole is or is not strongly affected by 
the outcomes of interactions between individual parti- 
cles ('collisions'). In hydrodynamic systems for example, 
collisions between individual atoms or molecules matter 
only in the aggregate, as a pressure. In A^-body systems 
of galaxies for another example, where individual parti- 
cles may in fact be stars rather than atoms, a similar 
statement can be made if the two body relaxation time 
is long compared to the expected simulation time or the 
lifetime of the system. 

Even though the underlying physical systems may be 
collisionless in the sense that no particular interaction 
affects the result, simulations of them may not be be- 
cause the number of particles used in the simulation is 
typically many orders of magnitude smaller than the ac- 
tual number of bodies in the real system. Particles in 
these simulations are actually meant to represent an ag- 
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gregate of some very large number of physical particles. 
Actual collisions between them, as individuals, are there- 
fore unphysical. In order to recover the coUisionlessness 
of the physical system, forces between particles must be 
'softened' in some manner or, in other words, the parti- 
cles must be provided with an internal density distribu- 
tion consistent with their interpretations as aggregates 
of many particles rather than as a distinct entity. 

In practice, softening is achieved by modifying the 
gravitational potential on small scales to avoid the pure 
1/r form and the associated numerically infinite forces at 
small separations. The softened form actually defines a 
mass density distribution for each particle, so that they 
include an assumption of some spatial extent, rather than 
that they are point like objects. Using softening, the 
artificial close encounters between iV-body particles are 
suppressed and the particles evolve under the action of 
the global gravitational potential created by all particles. 

4.4.1. Forms of gravitational softening in VINE 

A long standing debate pervades the astrophysi- 
cal literature concerning how gravitational softening 
sho uld be implemented in particle simulations (see 
e.g. iMerrittI [199 6: Bate & Burkcrt 1997; RomcoJlM^ 
lAthanassoula et al.ll2000l : lDehnenll2001l : iNelsonll2006l and 
references therein for detailed discussions). No form of 
softening so far proposed exists which is free of defects 
and some care must be taken to ensure simulation re- 
sults do not depend on the implementation of softening. 
In practice, two alternatives are common; first to use 
'Plummer softening' and second to use a spline based 
kernel, perhaps also with a space and time dependent 
length scale. VINE implements both options, via a user 
selectable switch. 

Plummer softening defines the density function of each 
particle to be a Plummer sphere, so that the force on 
particle i due to particle j, at a distance = jr^ — rj|, 
becomes 
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where is the unit vector connecting particles i and j. 
The potential is defined similarly as 
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where e is a softening lengt h scale. Plumm er softening 
was originally motivated bv lAarsethl ()1963f ). who used 
it in simulations of clusters of galaxies. Its most com- 
pelling advantage is that it is simple to implement and 
computationally inexpensive. Its major drawback is that 
it never converges to the exact Newtonian potential at 
any separation. 

Spline based softening defines the density function of 
a single particle using a predefined kernel that extends 
over some finite region in a manner not unlike that used 
to derive hydrodynamic quantities in SPH. Although in 
principle any kernel could be used, VINE implements the 
kernel defined in equation [TH used for the SPH inter- 
polations. This kernel has the advantages that it has 
compact support, i.e. for r > 2h it takes the exact 
Newtonian form, that for hydrodynamic simulations with 
smoothing and softening set equal, pressure and gravita- 
tional forces are nearly equal and opposite on small scales 



(jBate fc Burkertlll997f) . and that for a given number of 
particles, errors in the force calculati on are smaller than 
those of many other possible kernels (|Dehnenll2001h . 

To specify the force on particle i due to particle j, we 
apply Gauss's law and integrate over the kernel density 
distribution to obtain the fraction of the source particle's 
(j) mass enclosed by a sphere with radius equal to the 
separation between them: 



rh{rij)^4TT / p{v)v dv 
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(45) 



where the density, p, is replaced by the softening kernel 
W in the second equation, and v is as defined in section 
[31 In 2D, a similar form for rhj can be defined but leads 
to a finite, non-zero force at zero separation because the 
conceptual framework that underlies the definition does 
not carry ove r unaltered from three dimensions to two 
(lNelsonl[2i006h . VINE instead modifies the form of the 
kernel in 2D to that proposed bv iNelsoiil 120061 to avoid 
the inconsistency. Given the modified definition of the 
source particle's mass, the force and potential are defined 
by the equations for Newtonian gravity: 



and 
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Note that as two particles come closer (without decreas- 
ing their softening lengths), the force decreases to zero 
because the mass enclosed decreases to zero, and that 
the force is anti-symmetric with respect to interchange 
of i and j. At great distances, rhj becomes mj and the 
exact Newtonian form is recovered. 

Three softening variants are available in VINE, via op- 
tions specified at runtime. First, Plummer softening may 
be selected with a fixed softening length e for all particles. 
When the GRAPE option (see section 14. ip is selected 
for gravitational force calculation, this is the only option 
available due to hardware constraints. Second, either 
'fixed' or 'variable' kernel softening may be selected, with 
the latter option affecting only the treatment of SPH 
particles. If the fixed option is selected, each SPH par- 
ticle i is softened using a single softening length hi — e, 
specified at run time for all particles. After the gravity 
calculations are completed, smoothing lengths are reset 
to their locally defined, spatially and temporally variable 
values so that later hydrodynamic calculations may pro- 
ceed. If the variable option is selected, the individual 
(and time varying) smoothing lengths of each particle 
are used as individual softening lengths. In both the 
fixed and variable kernel softening options, all TV-body 
particles use the kernel and their predefined (fixed) soft- 
ening lengths hi. This alternative allows several species 
of A^-body particles to be included, each with their own 
(possibly different) softening length. 

Depending on the softening selection, either terms from 
equation [43l or equation [46l (in the form of accelera- 
tions) are summed over all particles derived from the tree 
traversal to obtain the gravitational acceleration acting 
on a particle, to be in the right hand side of equation [2| 
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4.4.2. Spline softening and the connection with SPH 

Calculation of gravitational forces between neighbor- 
ing SPH particles requires the identification of equation 
1191 as a density distribution, a different and stronger as- 
sumption that its use as an interpolation kernel in non- 
self gravitating SPH. Two important consequences arise 
from the identification. First, although there is no re- 
quirement that the hydrodynamical smoothing length 
and the gravitational softening lengths are equal, large 
force imbalances may develop if they are not, due to the 
different assumptions about the mass distribution within 
a single particle. The imbalances require careful consid- 
eration because they are consequences of the numerical 
assumptions, not of the physical systems, but can catas- 
trophically ch ange the outcom e of a s imul ation. We refer 
the reader to lBate fc BurkertI ()1997D and lNelsonI (|2006l ) 
for more detailed discussions, but note the main conclu- 
sion of both is that more favorable outcomes are obtained 
when gravitational softening and hydrodynamic smooth- 
ing lengths are set to be equal. 

Secondly, equation [45] will introduce fluctuations into 
the potential energy of the system (and of individual par- 
ticles) when it is used with variable smoothing lengths 
because the change in mass distribution within a particle 
is not accounted for in the potential when its smooth- 
ing length changes. In most cases, the variation will be 
small because it comes only from SPH neighbor particles, 
rather than from the entire mass distribution. However, 
in some cases it may be an important local or global ef- 
fect, such as in uniformly collapsing or expanding media, 
or in cases when a small subregion collapses by many or- 
ders of magnitude. This violation of energy conservation 
due to smoothing length variations can be estimated as 
(|Benziil990): 

dH = -G y ^ mimj——dhij. (48) 

Utlij 

In principle this formula could be used to correct the 
potential energy of SPH particles (but not their acceler- 
ations), but we have not included it in VINE because it 
requires an additional search for neighbor particles, with 
its associated computational cost. We also note that a 
conserva tive form of softening has recently been devel- 
oped by iPrice fc Monaghanl (|2007[ ). but this form has 
yet to be implemented in VINE. 

5. A THIRD PARTICLE SPECIES: 'STAR' PARTICLES 

Astrophysical systems frequently include multiple 
physical components with substantially different char- 
acteristics, embedded within and interacting with each 
other. Some examples of such components can be sat- 
isfactory modeled with the two particle types (SPH and 
A^-body particles) already discussed. For example, the 
example simulation in section [7.2.21 includes an interstel- 
lar gas modeled with SPH particles as well as three ad- 
ditional components (stellar disk, stellar bulge, and dark 
matter halo) each modeled with A^-body particles. In 
addition to the possibility that these sorts of fluid and 
non- fluid components could interact, another common 
characteristic in astrophysical systems is the interaction 
between a comparatively low density gaseous medium 
and one or more massive objects, such as stars. Such sys- 
tems include, for example, a single star surrounded by a 



gaseous circumstellar disk (jNelson et al.lll998l |2000D . or 
cloud of g as in which one or m any stars form during the 
evolution (iBonnell et al.|[2006l ). For modeling simplicity, 
each component may require separate treatment in order 
to capture essential physical phenomena peculiar to only 
one without distorting the evolution of another, and in- 
clude explicit coupling terms to model their interactions. 

In addition to the capability to evolve systems simu- 
lated in the purely A'^-body and SPH particle frameworks 
already described, VINE also includes the capability to 
model a third species of particles, similar to A^-body par- 
ticles, but with additional characteristics not present in 
the other types. This species is designed to be used in 
models requiring a small number of stars or other sim- 
ilar point masses in astrophysical systems, and interact 
with the rest of the system as gravitational 'sinks'. VINE 
evolves star particles using the same integration method 
used for all other particles (section[2]), and with the same 
time step criteria, though different coefficients may be as- 
signed. They interact gravitationally with all other par- 
ticles and with each other according to a softened Newto- 
nian force law, as described in section [44l Because they 
may be many orders of magnitude more massive than the 
other types of particles, they are not included in the ap- 
proximate, tree based force calculation, but instead take 
advantage of the fact that there will typically be a very 
small number of such particles to obtain forces via direct 
summation. 

The principle difference between VINE's implementa- 
tion of star particles and the simple A^-body particles al- 
ready discussed is their ability to 'accrete' particles of the 
other two types, should their trajectories bring them into 
close proximity. If activated by a switch set by the user at 
run time, VINE makes a check to determine whether any 
particles have moved to within one smoothing length, /i,, 
of each star after each timestep. When individual time 
steps are used, both the accretor and accreted particles 
must be at the end of their time step, in order to main- 
tain simplicity in the code and the integration scheme. 
When a particle is found, it is removed from the simula- 
tion and its mass and momentum are added to the star. 
In order to conserve the center of mass of the system, 
the star particle is artificially moved to the common cen- 
ter of mass of the pair. Similarly, VINE calculates the 
angular momentum of the star and fiuid particle around 
their common center of mass, then adds it to the star as 
an internal 'spin', in order for any later accounting of the 
system's total angular momentum to be conserved. 

6. BOUNDARY CONDITIONS 

In many contexts, simulations of entire physical sys- 
tems may be possible for many astrophysical systems of 
interest. Other systems may be studied only as some 
small subset of a larger whole, e.g., a star formation sim- 
ulation which models only a part of the parent molecular 
cloud. In either case, a complete model requires that con- 
ditions at the boundaries of the computational domain 
be specified in order to model the influence of matter 
from outside it. 

For particle based simulations of entire systems, 'free' 
boundary conditions require no special treatments be- 
cause the fluxes of material, momentum and energy are 
carried by the particles themselves. Particles feel pertur- 
bations only from other particles already present in the 
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simulation, not from any specific boundary, and move 
into adjacent to empty regions as conditions warrant. 
Otherwise, such regions require no computational effort. 
More complicated systems require active management of 
particles near boundaries, with specific treatments tai- 
lored to specific problems. 

For the example star formation simulations above, the 
relevant treatment will be of a volume of space sur- 
rounded by an infinite series of other, identical volumes, 
replicated in succession at greater and greater distances 
in each direction from the original. VINE includes a 
module to implement periodic boundary conditions in 
this context and we describe it below. Being much rarer 
in astrophysical contexts, we have chosen not to imple- 
ment reflecting boundaries, for which material approach- 
ing some surface is repelled with identical velocity but 
opposite sign, although a template routine has been in- 
cluded for this purpose. 

6.1. Periodic Boundaries 

When periodic boundary conditions are employed, all 
matter resides inside a predeflned simulation box. The 
boundary conditions then imply that calculations of 
physical quantities, such as gravitational forces, gas den- 
sities, etc. are carried out as if the box were surrounded 
in every direction by a series of identical replicas, extend- 
ing to infinite distance in all directions. Particles whose 
trajectories pass through one of the box's boundaries, 
entering one of the replications, are artificially restored 
to the original box on the opposite boundary. Therefore, 
after every position update, VINE performs a check for 
particles which have moved outside of the box. If any 
such particles are detected, VINE adds or subtracts the 
length of the box to the appropriate position component 
of those particles, effectively reinserting them on the op- 
posite side of the volume from the one from which they 
exited. Velocities are unchanged. 

6.1.1. Gravitational Periodic Boundaries 

Various treatments of gravitational periodic boundary 
conditions in particle evolution codes have been discussed 
in the literature. Most are based on the Ewald method 
(Ewald 192lj) and we refer readers to Hcrnquis t et al.l 
(|l99lf ) for the application to astrophysical parti cle sim- 
ulatio ns^ . Our implementation closely follows iKlessenI 
(|1997D . so here we describe only the overall approach and 
basic equations and refer the reader to his work for ad- 
ditional details regarding implementation, accuracy and 
performance. 

When a simulation uses periodic boundaries, the New- 
tonian potential of the particles must be replaced by the 
potential of an infinite, periodic replication of those par- 
ticles. Thus the Poisson equation becomes 
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n j=l 

where L is the period and n an integer vector. The gen- 

^ note however that their eq. 2.14b is missing a factor |r — nL|, 
see IKlessen|[T997l . 



eral solution can be written as 

N 

0(r) = -G^^m,f;(r-r,-nL) 

n j = l 



G 



N 



(50) 



(51) 



j=l k 



with the Green's function for the gravitational potential 
given by t/(r) = 1/r or, in Fourier space, G = 47r/fc^. 
The sum in equation [501 converges only slowly. Ewald's 
method alleviates this problem by splitting the summa- 
tion into a short range part, which converges quickly 
in real space and a long range part, which converges 
quickly in Fourier space. So the Green's function be- 
comes Q ~ Qg + with 



C/S = -crfc(ar); 
r 



Gl = -erf (ar); 
r 



GsCk) = 



A-K 
47r 



1 — exp — 



4a2 



(52) 



(53) 



where a is a scaling factor with units of an inverse length, 
erf(a;) and erfc(a;) are the error function and its com- 
plement, respectively. To balance good accuracy with 
low computational cost, the parameter values a = 2L, 
|r — nL| < 3.6L and k = 27rh/L with h being an inte- 
ger vector with | h| < 10 are a reasonable choice (see e.g. 
iHernquist et al.iri991h . The resulting equation for the 
potential is 



N 



[■) =-G^mj 



E 



erfc(a|r — — nL\ 



1 v-^47r 



and the force on a particle i is 

F{ri) = -miV(j){ri) = Grrii ^ f (r^ - r^) 



(54) 



(55) 



with the function f defined as 
r — nL 



[erfc(Q;|r — nL\ 



2a 



1 



nL \ exp( 
47rk 



Most practical implementations use a modified force 
law for computing the forces from any node or particle 
on the i nteraction list ob tained from traversing a tree 
(see e.g. iDave et al1ll997[ ). When GRAPE hardware is 
used however, no modified force law can be used, since 
only an unmodified Plummer force is programmed into 
the hardware itself. In order to preserve the possibility 
of the use of GRAPE hardware in VINE in this case 
as well, we have therefore chosen an alternate imple- 
mentation of Ewald's method (Klesscn 1997). Instead 
of modifying the force law for all interactions inside the 
simulation box, the forces from matter inside the simu- 
lation box are computed as if no boundary was present. 
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then all particles get a correction force to account for the 
contributions from neighboring boxes. As in equation 1561 
above, all pairwise interactions between particle pairs are 
included and we only want to compute the correction due 
to all i-periodic replications outside of the box, we sim- 
ply need to subtract the interaction of the pair inside the 
box: 

r 



fcor(r) =f(r) 



(57) 



Similarly the pure correction for the potential is 
'^cor(r) — 0(r) + l/|r|. Note that fcor and (j)coi actu- 
ally represent the corrective Green's functions for force 
and potential, respectively. 

To determine the correction, VINE maps the particles 
ont o a grid using the CIC (Clo ud-In-Cell) scheme (see 
e.g. 'Hock nev fc Eastwoodl[l98l l), creating a density dis- 
tribution on the grid, which is then Fourier transformed. 
The result is convolved with the Fourier transform of 
the Green's function, fcor, describing the correction and 
then transformed back into real space. Finally, the forces 
and potential are mapped from the grid back onto the 
particles, giving every particle the desired correction of 
the force and potential at the particle's location. Tables 
with the Fourier transformed correction terms are pre- 
computed, so that during the simulation only the trans- 
form of the density field needs to be calculated, then it 
is convolved with the corrections and transformed back. 
For a de tailed de s cripti on of the method, we refer the 
reader to iKlessenI (jl99l). 

The computational cost of the calculation of the cor- 
rection terms is governed first by the grid resolution and 
second by the speed of the FFT algorithm used. Klessea 
()1997f ) shows that sufficiently accurate forces can be ob- 
tained if the linear density of grid cells is at least twice 
that of the particles. For example, if e.g. 64^ particles 
are used, the grid should have a resolution of at least 
> 1283. 

Finally, if the periodic boundary conditions are used in 
connection with the individual particle timestep scheme 
(see section [2^ . the periodic correction part of the force 
does not have to be calculated on every particle timestep, 
which would otherwise be a significant overhead. In- 
stead, a CFL-like condition can be applied to calculate 
new corrections whenever a particle could have traveled 
a given fraction of a grid cell since the last time the pe- 
riodic corrections have been calculated. 

Most publicly available and vendor supplied numerical 
libraries include some variant of fast, parallel FFT al- 
gorithms. Rather than writing multiple variants of our 
Ewald code fo r each of these librarie s, we link to the 
FFTW library (jFrigo fc Johnsonll2005D . commonly avail- 
able on most computing platforms, in order to retain 
both maximum portability for VINE and to realize high 
performance on all platforms. 

6.1.2. SPH Periodic Boundaries 

Calculations of hydrodynamic quantities are always 
local in the sense that only neighboring particles con- 
tribute to any given particle's hydrodynamic quantities. 
No complex algorithms, such as Ewald's method for grav- 
ity, are required to determine forces due to particles in 
neighboring replications of the simulation box. The only 
change that must still be made is to modify the definition 



of the distance between pairs of particles, so that parti- 
cles located near a boundary see neighboring particles 
from both sides of that boundary. 

The required modification utilizes the fact that no par- 
ticle can be more distant from another than one half of 
the box size in any direction. If it were, its ghost particle 
in a neighboring box would instead be chosen as neigh- 
bor because its separation in that direction was smaller. 
We can therefore define the separation in each direction 
as 



Sx, 



L nint 



L 



(58) 



where L is the box length and 'nint' stands for the 'near- 
est integer' function and takes the value of zero or ±1, 
depending on the value of its argument. With this defi- 
nition, and accounting for the periodicity of the box, the 
separation of a pair of particles can be calculated sim- 
ply as the difference between their natural coordinates, 
plus an extra term which reduces to zero when the nat- 
ural separation is small. If the magnitude of their nat- 
ural separation is larger than L/2, the box length will 
be added or subtracted as appropriate. Combining the 
separations in each of the coordinate directions yields a 
distance. The computational cost of the extra operations 
are quite small, and no additional code infrastructure is 
required to handle the existence and storage of actual 
ghost particles. No additional infrastructure is required 
for neighbor searches either, since equation [55] is triv- 
ially generalizable to calculations involving tree nodes as 
well. After neighbor identification and distance calcu- 
lations are complete, computations of all hydrodynamic 
quantities, whether involving the distance metric directly 
or indirectly, through the SPH kernel (e.g. densities or 
velocity and pressure gradients), proceed as in the case 
of free boundaries. 

The definition of separation in equation 1581 has one mi- 
nor side effect, that any particle so large that it is a neigh- 
bor of both some other particle contained in the simula- 
tion volume and one or more of that particle's duplicates 
in neighboring volumes will find only one instance of that 
particle. Its duplicates will not be found. We consider 
this situation to be extremely unlikely for any simula- 
tion containing more than a few tens of particles, and so 
neglect the possibility entirely. 

7. TEST SIMULATIONS 

In this section we present various tests of the code, 
demonstrating the capabilities of VINE for SPH simula- 
tions as well as for A^-body simulations. 

7.1. Adiabatic collapse of a cold gas sphere 

Since lEvrardl (|1988D the adiabatic collapse of a cold, 
initially isothermal gas sphere under its own gravity has 
been a widely use d test c ase for SPH codes, s ee e.g . 
Hernauist fc Katzl (l9891: Steinmetz fc Miiller (l993j; 
Hultman fc Kallander (1997): Carraro et al. (1998); 
Springel et all (|200H ): iThacker et al.1 (|2000f) . Adopting 
a similar setup as these authors, we simulate the col- 
lapse of a spherically symmetric gas cloud with density 
profile 
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where M is the total mass of the cloud and R is its maxi- 
mum radius. For simplicity we choose a unit system with 
G = M = R — 1, again similar to previous authors using 
this test case. The particles are initially at rest and have 
an internal energy per unit mass of u = 0.05 GM/R and 
a ratio of specific heats of 7 = 5/3. 

Once released, the system undergoes rapid collapse 
from inside out and reaches maximum compression at 
t « 1.1. Slightly before this time, enough kinetic en- 
ergy has been converted into heat to build a pressure 
supported core in the center. Material that falls in later 
bounces off this core and is accelerated outward, form- 
ing a strong shock wave which interacts with the still 
infalling outer portion of the sphere. 

The simulations presented below were run using the 
Gadget MAC (equation for the tree with 9 = 10~'^. 
We used the spline kernel, equation (THl with a fixed 
length scale h = 0.02 for softening the gravitational 
potential of the particles. The hydrodynamic smooth- 
ing length was adaptive, so that each particle retains 
« 50 neighboring particles (see section at all times. 
The individual time step scheme described in section 
12.41 was used for all runs. The parameters for the time 
step criteria (see equations [TTl [T^l HH [37] and [35]) were 

Tacc = Tyel = T^a = 1, TCFL = 0.3, Th = 0.15. AU 

runs were performed with the leapfrog integrator and 
some of them with the Runge-Kutta-Fehlberg integrator 
as well, with the latter using trkf = 10~^ (see equation 
I16|) . However, the difference between the two integration 
schemes is very small (see below), so we focus on the 
leapfrog simulations for most of the analysis presented 
here. 

7.1.1. Fixed Artificial Viscosity 

The evolution of potential, kinetic and internal ener- 
gies of the system is shown in figure [2j The simulations 
have been performed at four different resolutions, using 
5000, 10000, 40000 and 80000 particles. In addition, we 
have also run a simulation with 640000 particles as an 
internal reference standard against which the other runs 
can be compared. This will make it easier to assess the 
convergence of the simulations with increasing resolution 
and the effect of resolution on dissipative effects. 

Comparing the 40000 and 80000 particle runs around 
the time of maximum compression (0.7 < t < 1.5), no en- 
ergy component is different from the corresponding value 
in the other simulation by more than a few percent, with 
the largest differences coming near t ^ 1.1, when the 
gas is maximally compressed. Close examination of the 
curves show that slightly larger differences between these 
two simulations and the much higher resolution reference 
standard are present. For example, near maximum com- 
pression, the reference run has converted somewhat more 
gravitational energy into internal energy than any of the 
lower resolution realizations. We believe this occurs be- 
cause the correspondingly lower dissipation at high res- 
olution allows for higher infall velocities and higher gas 
compression when the kinetic energy is converted into 
thermal energy. For the same reasons, the peak com- 
pression time occurs earlier as well. 

During the equilibration and late stage adiabatic ex- 
pansion phase of the system, after t « 1.5, differences 
between the 40000 and 80000 particle runs relative to 
the 640000 particle run become more pronounced. At 



progressively higher resolution, more energy remains in 
the form of internal energy, while the gravitational po- 
tential energy remains more negative, indicating that 
the gaseous core formed from the collapse is hotter and 
more tightly bound at higher resolution than lower. The 
largest deviations between the reference standard and 
either the 40000 or 80000 particle realizations occur be- 
tween 1.5 < t < 2, and thereafter decrease. Even at these 
times, the differences remain no larger than ~ 5%, and 
we consider them to be well resolved, for the purposes 
required for this test. 

At the lower resolutions of 5000 and 10000 particles, 
the internal and gravitational potential energies deviate 
from the 80000 particle run values by ~ 10 — 15% at 
maximum compression, with even larger systematic de- 
viations at late times. These realizations have clearly not 
reached the level of convergence to the proper solution 
that is visible in the higher resolution runs. 

The relative differences of a run integrated with the 
Runge-Kutta-Fehlberg integrator (not shown) as com- 
pared to a corresponding leapfrog run were never more 
than 0.63% in the internal energy, 0.44% in the kinetic 
energy and 0.40% in the gravitational energy. The corre- 
sponding difference in total energy reached a maximum 
of 0.32%. We do not consider these differences to be 
highly significant and so in the remainder of our discus- 
sions, we will present only the results of the simulations 
using the leapfrog scheme. 

Figure [3] shows the relative error in total energy as a 
function of time, for each of the four realizations at differ- 
ing resolution and our high resolution reference standard. 
In every case, the errors peak near t 1.1, corresponding 
to the most compressed state of the system. Specifically, 
the maximum errors in total energy are 0.84%, 0.91%, 
0.93% and 0.86% for the resolutions of 5000, 10000, 40000 
and 80000 particles, respectively. Although we have ob- 
served error peaks both in these simulations and in those 
of other systems (see e.g. figure [TU] below) . their origin re- 
mains somewhat unclear. One possibility is the fact that 
rather large fluctuations in the neighbor counts of SPH 
particles over time are permitted using the default set- 
tings of VINE. Similar fluctuations have been shown to 
gener ate energy errors in other work (e.g. lAttwood et al.l 
I2OOI and may also contribute here. Rerunning our sim- 
ulations with tighter restrictions on the neighbor counts 
reduced the peak only slightly however, so we could make 
no clear connection to this possibility. 

Another potential source for the energy error is that 
the SPH scheme in the current version of VINE does 
not incorporate the so called V/i terms into the equa- 
tions of motion and energy. These terms arise from 
each particle having its own, variable smoothing length 
and neglecting them can also give r ise to energy er- 
rors ([Nelson fc Papaloizoulll993l ll994f ). However, these 
terms make the SPH equations considerably more com- 
plex and have hence not been widely adopted in the lit- 
erature. A more promising and consistent alternative 
is using t he SPH formulation derived from variationa l 
principles ([Springel fc Hernquistir2002l : lMonaghanll2002D . 
which naturally takes the V/i terms into account to all 
orders without the terms being explicitly present in the 
equations themselves. Such an approach will be included 
in a future release of VINE. In the current revision, we 
note that we have been unable to remove the error peaks 
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Fig. 2. — Time evolution of the total internal, kinetic and gravitational energy of the collapse test case. Resolution ranges from 5000 
particles to 80000 particles, as shown in the box, with a run including 640000 particles run as a reference standard. 



completely through any combination of code, integra- 
tor or force accuracy settings, though their magnitude 
changes somewhat, while retaining computationally af- 
fordable simulations. Nevertheless, the overall magni- 
tude of the error peak is small compared to ambiguities 
and shortcomings in the physical models of most systems 
of interest in astrophysical contexts, and we believe that 
the level of energy conservation produced by the code 
will be sufficient to model such systems accurately. 

After peak compression, the errors fall to 0.7%, 0.74%, 
0.67% and 0.57%, respectively, and remain nearly con- 
stant for the remainder of the evolution. The 640000 
particle reference run reaches much lower errors: 0.65% 
at peak and 0.21% during the later phases of the simula- 
tion. Since both the integration accuracy parameters t^, 
TCFh, Vi and trkf and the MAC setting for the gravi- 
tational force calculations were standard values and not 
particularly tuned for very accurate time integration, we 
believe this level of energy conservation is quite accept- 
able. Decreasing these parameters easily leads to energy 



conservation to ~ 0.25% or better throughout the simu- 
lation. 

7.1.2. Time dependent Artificial Viscosity 

We have also used this test case to study the differences 
due to use of the time dependent implementation of the 
artificial viscosity (AV), as defined in equation [29l In 
this test, we study simulations at a single resolution of 
80000 particles and vary the implementation of artificial 
viscosity. For reference and comparison to the results 
above, we also include the same 80000 particle simulation 
shown in figure [U characterized by an AV formulation in 
its standard form fequation [26)) . with coefficients fixed at 
a = 1 and /3 = 2, and the Balsara correction (equation 
[28|) . For comparison, we also include the high resolution 
reference simulation, with the same formulation. Models 
with the time dependent AV coefficients (equations [29]- 
I3ip were run with different settings of the parameters 
a, and 6a, defining respectively, the value for the time 
dependent AV coefficient in quiescent flow and the scaled 
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Fig. 3. — Time evolution of the relative error in total energy for 
the runs shown in figure [5] 

length over which that coefficient decays to its minimum 
value. 

Figure |4] shows the time evolution of the total internal, 
kinetic and gravitational energy of the system for the 
80000 particle model without time dependent AV and 
for four models at the same resolution identical except 
for changes in the time dependent viscous coefficients. 
The distribution of energies among components varies 
by as much as 10-20% for different settings, with the 
two a* = 1 variations producing the largest differences 
during and after peak compression. They are peculiar 
in the sense that substantially less internal and kinetic 
energies are generated, than in the fixed coefficient case. 
This is due to the higher dissipation in the initial infall 
phase {t < 1), which acts to slow the infall. The be- 
havior appears qualitatively similar to that seen in the 
lower resolution variants in figure [2l for which we ex- 
pect a correspondence between the lower resolution and 
higher dissipation. Since, for the parameters used, the 
viscous coefficient will always be larger than unity, the 
time dependent viscosity has the effect of lowering the 
effective resolution in the simulation, but since it is ex- 
pected to be used only with a much smaller coefficient, 
the problem will not be severe in practice. 

Both models with a* = 0.1 produce similar results for 
all three energy components and, more importantly, re- 
sults which are very similar to the model with fixed AV. 
Only at and shortly after the maximum compression do 
differences appear, most visibly in the kinetic and grav- 
itational energy components, where more resides in ki- 
netic energy than in the fixed AV case, and less (more 
negative) in the gravitational potential energy. The dif- 
ferences from the 80000 run with fixed AV are at most 
a few percent over the entire duration of the simulation. 
More importantly, the models with a* — 0.1 resemble 
our high resolution reference run with 640000 particles 
better than the 80000 particle run with fixed AV. The ef- 
fect of the time dependent AV, when used with values of 
a* = 0.1, is to simulate a higher resolution/lower viscos- 
ity system, and is clearly a very desirable effect. Of the 
two models with a* = 0.1, the one with (5q, = 5 is more 



similar to the 640000 particle reference run. At maxi- 
mum compression, the minimum in the potential energy 
is not as deep as in the model with Sa = 2, and the corre- 
spondence between the kinetic energies is closer as well. 
At late times, the model with Sa = 2 tends to higher (less 
negative) potential energies, in fact higher than both the 
640000 particle run and also higher than the 80000 run 
with fixed AV. 

In figureEl we show the relative error in total energy for 
the four models with time dependent AV, the fixed AV 
reference model and the high resolution reference run. 
The high a* models have larger errors than the model 
with fixed AV, with the 5a = 5 model coming slightly 
closer to the fixed AV result than the i5q = 2 model. The 
two low a* models develop much smaller errors during 
maximum compression, but after the peak, their energy 
errors go in the opposite direction. At the end of the sim- 
ulation, the error of the a* = 0.1, (5q = 2 model reaches 
0.9%, as large or larger than the peak errors of both the 
fixed high a* models, and is still growing. The model 
with a* = 0.1, 6a — 5 shows the best performance overall 
on this problem. Interestingly, its energy error decreases 
after maximum compression, similar to the correspond- 
ing Sa ^ 2 model, but becomes almost flat and is 0.04% 
at the end of the simulation. At and around the peak, 
its behavior is very similar to the high resolution refer- 
ence run. After i « 1.2, it falls to lower errors than the 
high resolution run, but overall it is still the model with 
comes closest to the 640000 particle reference run. 

The differences seen in figures |4] and [5] are due to the 
differences in the ability of one set of AV parameters to 
model shocks and compressions with more fidelity than 
another. Because modelling such shocks and compres- 
sions is very sensitive to how AV is implemented in a 
code, we move now to a closer examination of the radial 
density profiles of the simulations, in which we expect a 
shock front to be clearly visible. 

Figure ini shows density profiles at four times during the 
evolution for each of the models, and figure [7] shows a 
close up view of the density at the shock front at i = 0.9. 
In figure [SI we show the corresponding entropy profile in 
order to facilitate a comparison with SPH variants which 
inte grate an entrop ic function instead of internal energy 
(e.g. ISpringe]|l2005[ ). Pre-shock entropy production is a 
known problem of SPH with this test case. 

For comparison, the figures also show the result of 
a one dimensional simulation of the same system using 
the Piecew ise Parabolic Method (PPM ) with 350 zones, 
taken from lSteinmetz fc Miilled ()1993[ ). We expect better 
AV behavior in the SPH runs will be refiected in curves 
more similar to that obtained from the PPM run. As an 
additional point of reference, we also show the result of 
the 640000 particle run in the close up views of the shock 
front, shown in figures [7] and [H 

In every realization, the overall density profiles are sim- 
ilar, with high densities in the center, a discontinuity fur- 
ther out, and a decrease to low densities at the largest 
radii. The similarities give confidence that the gross be- 
havior of all variants produce sensible results. There are 
differences specific to each realization however. For ex- 
ample, the high viscosity models disagree most at the 
largest radii, where the densities are overestimated rela- 
tive to the PPM model, especially at late times when the 
material is expanding freely. The low viscosity and fixed 
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Fig. 4. — Time evolution of the total internal, kinetic and gravitational energy of the collapse test case for the model with 80000 particles, 
using different implementations for the artificial viscosity. In addition, a high resolution reference run with 640000 particles is plotted as 
thick dash-dotted line. 

AV models agree well with the reference PPM curve there 
at all times. At small radii, all the SPH models under- 
estimate the central density. The low viscosity models 
(a* = 0.1) agree best, followed by the fixed AV and long 
decay constant, high viscosity model and, worst of all, 
the model with a* = 1 and 6a = 2, for which a peak 
density nearly a factor two below that of the PPM run is 
observed. The central densities agree with the PPM real- 
ization to within ~ 3.7% in the 640000 particle reference 
run at t = 0.75 (though are not plotted in figure [6l in 
order to reduce clutter) and, at late times, this SPH run 
actually overestimates the central density by as much as 
10%. Using the central density as a metric, the models 
with 80000 particles used for this AV test are not fully 
converged, but the similarity between the PPM and SPH 
methods at higher resolution indicate that they do con- 
verge to very similar results, given sufficient resolution. 
During the compression phase, the high viscosity mod- 



els show significant broadening of the shock front, as is 
expected, but also an outwards position shift of the foot 
of the shock with respect to both the PPM results and 
the other SPH models, especially visible in the t = 0.9 
close up view in figures [7] and [H The low viscosity mod- 
els reproduce the density at the shock front more accu- 
rately and at the same radial position as the PPM model. 
In front of the shock front (radially outwards), the den- 
sities in both low viscosity models are very similar to 
each other and to the fixed AV model, while behind it, 
the a* = 0.1, <5c[ = 2 model exhibits a much shallower 
rise than it should. The longer decay constant variant 
(a* — 0.1, Sa — 5) shows a much steeper density profile 
there, but still not as steep as the fixed viscosity version, 
which appears to be the best reproduction of the shock 
front. The high resolution model shows closest corre- 
spondence to the PPM density curve at the foot of the 
shock, but the high density side nearly over lies its lower 
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Fig. 5. — Time evolution of the relative error in total energy for 
the runs shown in figure |4] 

resolution cousin, also with fixed AV. Later, near maxi- 
mum compression (t = 1.05-lower left panel of figure [S]), 
simulations with all of the AV implementations misplace 
the shock front, to slightly larger radii. 

The entropy profiles shown in figure [5] exhibit behav- 
ior consistent with that shown for the density profiles. 
The high viscosity settings (i.e. with a* = 1) result in a 
wider shock front than the low viscosity (a* — 0.1) set- 
tings, which in turn show the front at larger radii than 
the fixed AV settings. Of the latter, the reference run 
with 640k particles reproduces the PPM reference lo- 
cation of the front very well. All of the time varying 
viscosity settings generate a shock front at larger radii 
than either the fixed viscosity settings or the PPM refer- 
ence. All settings exhibit the problem, common in SPH, 
of pre-shock entropy generation, though at varying lev- 
els. The low viscosity models do slightly better at radii 
r > 0.22, but worse at smaller radii, closer to the front 
itself. Both also overpredict the entropy peak, similar 
to the high resolution model with fixed AV. Behind the 
shock front, all models produce lower entropies than the 
PPM reference run. The high viscosity, longer decay set- 
tings are closest to the reference, while the low/short set- 
tings underpredict by the largest margin. The low/long 
model (a* = 0.1, (5q, = 5) most closely matches both the 
low and high resolution fixed AV models interior to the 
shock, recovers the shock position and width better than 
the settings of any other time varying model and pro- 
vides less pre-shock heating at larger radii than the fixed 
AV model at the same resolution. 

Given the behaviors we see in the shock structure and 
in both the components of the energy and the conserva- 
tion of total energy, we conclude that the time dependent 
AV formulation allows a simulation to be performed with 
dissipation from artificial viscosity at a level comparable 
to that of a much higher resolution simulation that does 
not include such a formulation. Our results show that 
the parameters a* = 0.1, (5q = 5 should be used for best 
results if time varying viscosity is desired. Simulated 
with these parameters, our test system yielded the best 
overall conservation of energy, closest correspondence of 



all energy components to a high resolution, fixed AV ref- 
erence model, at the cost of a slightly more broadened 
shock front compared to a run with fixed AV. Otherwise 
the shock capturing abilities are comparable to the stan- 
dard formulation with fixed AV. In the absence of shocks, 
the time dependent AV formulation yields better results 
than the time independent formulation. Therefore we 
favor the use of the time dependent formulation. 

These settings di ff er fro m those recommended by 
iMorris fc M onaghanI ()1997D . of (5q 2 for the decay 
length. Such differences illustrate the problem depen- 
dent nature of the settings themselves, and are most 
likely consequences of the presence of shocks of differ- 
ent strength in different problems. Users of VINE who 
wish to employ the time dependent AV formulation may 
need to account for such differences when choosing ap- 
propriate viscous parameters. 

7.2. Elliptical- Elliptical Merger Tests 

The quality of hydrodynamic simulations with VINE 
has been studied in detail in section FTTl In order to as- 
sess vine's ability to correctly evolve A^-body models, 
we compare the evolutionary trajectory of models simu- 
lated with VINE and with Gadget-2. We use a merger 
simul ation of two elli ptical galaxies (a 'dry merger', see 
e.g. iNaab et all l2006a^ as a test problem. As the gas 
fraction in such systems is very low, a pure A^-body rep- 
resentation of the galaxies, or one including only a small 
fraction of gas particles, is a realistic model. The evolu- 
tion of such a dry merger requires correct behavior of the 
code over a wide range of dynamical timescales. While 
the merger event itself is short compared to the dura- 
tion of disk galaxy mergers, its correct simulation re- 
quires accurate time integration of particle trajectories 
in a highly time dependent gravitational potential, re- 
quiring a wealth of interesting dynamical behaviors to 
be accurately modeled. The correct simulation of such a 
system is therefore a challenging test case. 

The initial conditions for our tests are created by set- 
ting two elliptical galaxies on a parabolic orbit, which is 
a reasonably realistic setup for the orb it in a cosmolog- 
ical context ([Khochfar fc Burkertll2006l ). Each elliptical 
galaxy is understood to be the remnant of a prior disk 
merger event whose components had a 1 : 1 mass ratio and 
in which each of the spiral galaxies consisted of a stellar 
disk and bulge as well as a dark matter halo, which in 
these tests were modelled with 60000, 20000 and 120000 
particles, respectively. Therefore, each elliptical galaxy 
in our test simulation consists of 160000 stellar particles 
and 240000 dark matter particles so that the entire sim- 
ulation contains 8 00000 partic l es. Fo r more details, we 
refer the reader to lNaab et al.l ()2006aD . 

7.2.1. Requirements for sensible code to code comparisons 

For any physical model sufficiently complex to be in- 
teresting for anything more substantial than an academic 
exercise, the only option for validating the numerical 
code used to simulate it is to compare against results 
obtained from other codes. In this sense, validation ef- 
fectively assumes that while those other codes have been 
developed to solve the same set of equations, they have 
been constructed differently enough, and have been vali- 
dated independently based on a sufficiently different set 
of criteria, to ensure that they are effectively neutral 
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standards against which the results of another code can 
be measured. The question of whether or not this as- 
sumption itself can be validated is unlikely ever to be 
answered satisfactorily. Nevertheless, we will proceed as 
if it has been, and compare the results of a simulation 
rim using VINE with the results of the simulation using 
the same initial conditions but evolved with the pub- 
licly a vailable and widely used Gadget-2 code of lSpringel 
(|2005h . 

Of course, various input settings used for a particular 
simulation will directly affect both its accuracy and the 
overall results that are obtained. It is trivial to run one or 
both codes with inappropriate parameters for the given 
problem and thus find disagreement in the results. Our 
first step in making comparisons must therefore be to 
minimize the differences originating in different settings 
by running an extensive set of tests prior to the actual 
comparisons that are our main interest. Only when this 
process is complete can we attempt to characterize re- 



sults obtained from each code on an even handed basis. 

For our comparisons, VINE uses its leapfrog integrator 
(section [2.2p with individual timesteps (section [2. 4p and 
the Gadget MAC (equation |42]). The Gadget-2 code uses 
similar, but not identical, schemes for each of these three 
choices. With these choices given, there still remains a 
very large number of additional degrees of freedom defin- 
ing all of the different code settings available to the user, 
which are both different in each code and, to the extent 
that a correspondence exists, may require different val- 
ues to produce results of similar accuracy. It serves little 
purpose to match the settings of every one of these pa- 
rameters, especially if the normal settings for one or the 
other code are significantly different. Instead, we choose 
to match the results of the codes to what we believe are 
among the two most crucial parameters: first the gravi- 
tational force accuracy, as controlled through the MAC 
parameter 9 f equation [42|) . and second, the tolerance pa- 
rameters, r, used in the integration scheme (equations 
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[nHia). 

Even with only two parameters to adjust, some dif- 
ferences are unavoidable. For gravity, VINE uses a 
multipole series truncated at quadrupole order and 
the hexadecapole b ased Gadget MAC described in 
iSpringel et al.l (|200lD . while Gadget- 2 uses a monopolc 
truncated series and a quadrupole based form of the 
MAC. Each code will therefore require different values of 
the MAC parameter 9 to obtain similar force accuracies. 



When the mass distributions within a given tree node are 
very inhomogeneous, as will certainly be the case in the 
intermediate stage of this simulation, truncation errors 
in the multipole summation will be systematically larger 
than for other distributions, resulting in larger errors for 
a given setting. In addition to the MAC differences, both 
codes test whether or not a particle and node overlap in 
space, but use different procedures to define approximate 
sizes for the nodes. While Gadget-2 errs conservatively 
by requiring the particle to lie outside of a box which 
is ~20% large r than the actual node (see equation 19 in 
iSpringell [20051 ). VINE uses a conservative definition of 
the node size itself (see Vine2), which also errs by over- 
estimating the node size. Which variant yields the more 
efficient criterion is not clear. 

Looking at the time step criteria, Gadget-2 does not 
use timestep criteria that include the velocity (equations 
[T2l - [T3|) . Also the actual equation implemented for the 
force criterion, equation IIH included the tolerance pa- 
rameter under the square root. So we expect to find 
similar integration accuracy for both codes at different 
values for the accuracy parameters. For simplicity in the 
discussion below, we will denote the accuracy parame- 
ters with Tacc^ but actually include all three r values in 
equations [TT1[T3l in VINE (set to the same value) . We 
will use the same symbol, Tacc, for Gadget-2 to mean the 
single active accuracy r parameter for Gadget-2, despite 
the different functional form of its implementation. 

7.2.2. Determining comparable integrator and tree opening 
parameters for each code 

We constrain the values of the parameters using a two 
step process. First we determine node opening parame- 
ters which yield similar force errors for the particles. For 
both codes, we require that 99% of the particles have 
force errors < 0.2% and that the remaining 1% of the 
particles have gravitational force errors < 1%. To some 
extent, these limits serve as proxies for the full error dis- 
tributions of all particle forces calculated by each code. 
More importantly however, we note errors in the evolu- 
tion of a simulation can be dominated by large errors in 
only a few particles. In order to eliminate such sources 
from consideration (or at least make them similar in mag- 
nitude!), we require that each code produce similar error 
limits (i.e. single points on the error distribution curves) 
rather than similar error distributions. Wc use two lim- 
its in order to ensure both that the full distribution of 
errors is sufficiently low and also that a small tail of the 
distribution with high errors cannot adversely affect the 
simulation outcome. 

For the second step, we use these MAC values in sev- 
eral full simulations with each code, in which we vary 
Tacc and we require that the error in total energy never 
exceed 1% at any time during the simulation. While any 
particular choice of error limits is somewhat arbitrary, 
some choice must be made. These values reflect those 
used most frequently in our production runs. 

The setup was first simulated with VINE at very high 
accuracy. From this reference run we selected snapshots 
at three different times corresponding to the initial con- 
dition, just prior to the merger so that one galaxy is 
located immediately adjacent to the other, and a very 
late stage, well after the strongest interactions between 
the galaxies are complete. We will refer to these three 
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stages as the initial, intermediate and late stages, respec- 
tively. For the VINE calculations, accelerations for each 
snapshot were calculated directly from the code, with 
the results saved to files for later analysis. In order to 
compare identical mass distributions with the two codes, 
the three snapshots of the reference run with VINE were 
converted to a data format readable by Gadget-2. Each 
converted snapshot was read in to the Gadget-2 code 
which calculated an independent reference gravitational 
force for the particle distributions, as well as forces at 
various values for the accuracy parameters, again saving 
the results for later analysis. 

For all three stages, we investigated the relative error 
in gravitational forces as a function of the tolerance pa- 
rameter used in the relevant MAC. Our procedure of 
comparing three different stages of the simulation still 
does not necessarily guarantee that the force accuracy 
over the entire simulation will remain comparable, but 
we have some confidence that the deviations will not be 
too large because we test three very different mass distri- 
butions, representative of the range of mass distributions 
covered by the simulation. 

Figure [9] shows the error magnitudes for the median, 
95% and 99% error limits for each code at three different 
times in the simulations. As expected from the differ- 
ences in the MAC definitions, comparison of the error 
limits for the same value of the opening criterion show 
that the limits for VINE are not the same as those for 
Gadget-2, with the latter being considerably larger. The 
differences are reflected in the required opening param- 
eters: in order to fulfill the error limits described above, 
we require a value of = 5 x 10~^ for VINE and 9 = 10~^ 
for Gadget-2. Note that the points corresponding to the 
next larger 9 value for which we determined force errors 
(i.e. e = 10-2 for VINE and 61 = 5 x lO'^ for Gadget- 
2) both still violate our criterion of 99% of the particles 
having force errors < 0.2%, although the correspond- 
ing points in the bottom panels of figure [5] appear to be 
just below this limit. The parameter value required for 
Gadget-2 lies well below that at which the late time er- 
ror curve diverges from the curves defining the initial and 
intermediate time error limits. Its value is therefore not 
determined by any peculiarities of the mass distribution 
which caused the divergence in the first place and we do 
not expect that its value will be significantly altered in 
other morphologies. 

Beyond the simple statement of the MAC parameters 
required for each code, it is of some interest to exam- 
ine figure [9] further, to study the differences between the 
codes and, hopefully, to better understand the conse- 
quences various choices of the criteria may have on the 
simulations. 

For VINE, the error magnitudes for all three limits 
follow similar patterns as functions of the opening pa- 
rameter, at all three times. The error limit at the most 
permissive opening criterion for which we obtained data 
terminates well below the 1% level. Also, the error limit 
curves appear to decrease their slope at larger log 6* val- 
ues, rather than continuing a linear increase. Detailed 
tests discussed in Vine2 demonstrate that both phenom- 
ena continue towards still larger 9 values and that the 
error limits never increase beyond 2 — 3%, even in the 
limit of a uniform density (where we expect small force 
magnitudes due to the greater level of cancellation of par- 



tial forces), a fact that will be very beneficial in general 
for most simulations. 

For Gadget-2, the error limit curves of the initial and 
intermediate time particle distributions lie virtually on 
top of each other. The late time curve deviates some- 
what above 9 > lO^^, exhibiting a population of parti- 
cles with force errors approaching 6%-a level unaccept- 
able for most simulations of astrophysical interest and 
as much as a factor of 2 larger than either the initial 
or intermediate time curves. Also, the error limits for a 
given opening criterion do not appear to reach any limit- 
ing values as 9 increases towards more permissive limits. 
These observations are important because they demon- 
strate that the node opening criterion must be chosen 
with some care in Gadget-2, if large errors are to be 
avoided. 

With the MAC parameters defined, we turn now to 
a set of simulations in which we varied the integration 
accuracy Tacc, using the same elliptical merger setup as 
above. Our criterion for sufficient integration accuracy 
is to impose an upper limit on the error in total energy. 
Adopting the force accuracy parameters 9 from above 
and using different values for the integration accuracy 
parameter Tacc, we evolved our elliptical merger with 
VINE and Gadget-2 for 3.9 Gyr. However, as discussed 
in more detail below, wc find that the maximum energy 
error during the simulation with VINE depends not only 
on the integration accuracy Tacc, but also on the force 
accuracy parameter 9. They span a two-dimensional pa- 
rameter space. So we adopt the 9 values defined above 
as an upper limit which guarantees to fulfill our force 
error criteria, but we explore also the influence of lower 
9 values on the maximum energy error occurring in the 
simulation for both codes. 

As discussed in section [^751 the choice of Atmax is effec- 
tively also a time step criterion, so that Atmax and Tacc 
actually span another multi-dimensional parameter space 
of computational cost and accuracy. In this comparison 
however, we vary only the Tacc parameter, keeping Atmax 
fixed at 6.55 Myr for both codes, since we are interested 
in matching the simulation parameters between the two 
codes to compare their results at similar levels of error. 

In figures (TU] and [11] we show the error in total en- 
ergy for Tacc of 0.5 and 1.0 for VINE and 0.04, 0.05 and 
0.06 for Gadget-2, respectively. The overall shape of the 
energy error curves for the VINE simulations are quite 
different from those for the Gadget-2 runs. While the 
VINE simulations show sharp peaks at the times of the 
first close encounter and the final merger, the simula- 
tions with Gadget lack a similar feature, although with 
9 — 10"^ a slight bump is visible. However, the Gadget 
simulations exhibit a slow secular drift towards energy 
loss from the system, with the magnitude of the loss be- 
ing larger for the more permissive integrator settings. 

The error peaks in the VINE simulations rise to 1.35% 
for 6* = 5 X 10-3 and 1.14% for 6* = 2.5 x lO'^. Only for 
lower values of 9 are the errors lower than our required 
limit of 1%. If the integration accuracy parameter Tacc 
is lowered from 1 to 0.5, the peaks rise to slightly lower 
values, but still only satisfy our limit of 1% when 9 is 
decreased to 10"^. We have verified that still lower val- 
ues for Tacc than those shown in figure [TU] do not fur- 
ther decrease the peak in the energy error. Hence the 
dominant effect which determines the maximum energy 
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error with VINE is the force accuracy and not the inte- 
gration accuracy. Obviously the two parameters span a 
two-dimensional parameter space and in order to achieve 
low overall error with VINE, decreasing the integration 
accuracy parameter Tacc only leads to better energy con- 
servation if the force accuracy parameter 9 is set to low 
enough values. In order to meet our criterion of less 
than 1% error in total energy over the course of the en- 
tire simulation, VINE requires a combination of accuracy 
parameters 9 = 10~^ and Tacc = 1- 

For Gadget-2, there is also a dependence of the max- 
imum energy error on the force accuracy parameter 9, 
but the dominant factor is the integration accuracy pa- 
rameter Tacc- Hence we keep the force accuracy param- 
eter 6 = 10~^ derived from the force errors as above 
when investigating the integration accuracy. For values 
of Tacc = 0.06 and 0.05, the maximum energy errors grow 
beyond our limit of 1%. With Tacc — 0.04 the maximum 
error is 0.9% and we adopt this value for our code com- 
parison. 

Finally, we also note that the secular energy drift in the 
Gadget simulations may not be a problem of Gadget-2 



itself, but rather that the choice of Atmax is too large. In 
conjunction with the differently implemented error crite- 
ria, this could permit a sufficiently large number of par- 
ticles to be integrated forward on time steps too large 
to provide good fidelity. We have experimented with 
smaller Atmax values for Gadget-2 and have found that 
in fact, the drift can be alleviated with a smaller value, 
though at the cost of increased computation. 

7.2.3. Results of full scale merger simulations run with 
VINE and Gadget-2 

Using the above choices for the accuracy parameters, 

1. e. 9 = IQ-^ and Tacc = 1 for VINE as well as 6* = 10"^ 
and Tacc — 0.04 for Gadget-2, we ran a full scale elliptical 
merger model for 3.9 Gyr with both VINE and Gadget- 

2. The force errors and total energy errors for the two 
codes were discussed in section [7.2.21 Here we focus on 
a comparison of the results of these simulations. 

A direct examination of the overall morphologies of 
the mergers at different times, and as realized by the 
two codes, does not prove to be of great use to compare 
differences in the evolution provided by one code over the 
other. The features of the merger as evolved using one 
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Fig. 12. — Center of mass trajectories in the orbital plane for 
the stellar component in each galaxy. The symbols along the tra- 
jectories represent the following times in Gyr: 0.25, 0.5, 1.0, 1.25, 
1.5, 2.0, 3.5 
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Fig. 10. — Relative error in total energy for an elliptical merger, 
using different integration and force accuracy parameters with 
VINE 




-0,020 1 

1 2 ,3 4 1 2 3 4 

t [Cyr] t [Gyr] 

Fig. 11. — Relative error in total energy for an elliptical merger, 
using different integration and force accuracy parameters with 
Gadget-2. 

are present in the other, and we are not able distinguish 
between them in any quantitative fashion. Instead, and 
in order to compare the resuhs of both codes more quan- 
titatively, we explore the behavior of the centers of mass 
of the particles making up the stellar component of each 
galaxy. The stellar component, as the most luminous, 
is most interesting to compare because it resides in the 
central part of the galaxy and dominates the potential 
there. Tracking its behavior therefore means that we are 
tracking the dominant component of the most dynamic 
region during the merger. 

Figure [T^ shows the center of mass trajectories of the 
stellar component of both galaxies, as realized by each 
code. The trajectories of the galaxies simulated with 
VINE and Gadget-2 lie essentially on top of each other 
at all times. We have also plotted symbols along the 



trajectories to indicate different epochs in the evolution 
which have been chosen to sample the trajectories over a 
wide range of evolutionary stages. The position of each 
snapshot along the trajectories also matches between the 
Gadget-2 and the VINE simulation. 

In each case, the two galaxies have a first close en- 
counter after which they move to larger distances again. 
Then they reach a maximum separation, turn around and 
approach each other again and finally merge. During and 
after the final merger, the center of mass trajectories of 
the stellar components do not converge to a single point, 
even though the galaxies as a whole merge. Instead, they 
move apart as a consequence of the asymmetric distribu- 
tion and trajectories of tidal debris well outside the newly 
formed elliptical galaxy (some « 15% of the total stellar 
mass) which was created both during the merger simula- 
tion and assumed in the progenitor ellipticals, themselves 
taken to be remnants of mergers. The tidal debris is ac- 
celerated outward during the merger event and thus the 
center of mass of the star particles of a progenitor galaxy 
moves gradually away from the center of the new ellip- 
tical, as the tidal material moves to increasingly larger 
distances. The center of mass of the system as a whole 
is of course not affected by this behavior. 

We conclude our comparisons of the Gadget-2 and 
VINE simulations by comparing the merger remnant pro- 
duced by each simulation. In figure fT3l we plot the rota- 
tion curve of the galaxy and its surface density profile. 
The circular velocity profile of the simulations lie nearly 
on top of each other for both stars and dark matter, as do 
the surface density profiles over nearly their entire range. 
Only in the outer regions of the galaxy, where resolution 
begins to degrade, do very small differences become vis- 
ible. Thus the mass distributions of both the luminous 
and the dark component of the merger remnant agree 
well between the two codes. 

Given the complexity of the problem and the very dif- 
ferent features and algorithms implemented in the two 
codes, we conclude that both codes agree very well on 
this demanding A^-body test case. 
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Fig. 13. — Rotation curve (upper panel) and surface density 
profile (lower panel) of an elliptical-elliptical merger. The solid 
lines are results using VINE for the simulation, the dashed lines 
show Gadget-2 results. 

8. PERFORMANCE OF THE CODE 

In Vine2 we present detailed timings of the code on 
both serial and parallel test cases. Various optimizations 
to the code are described and their effects on the perfor- 
mance of the code investigated. We will therefore leave 
our most detailed discussions of the code's performance 
for Vine2, and perform only speed comparisons between 
VINE and Gadget-2, using the same simulation used for 
the accuracy comparisons in section 17.2.31 The perfor- 
mance comparison to Gadget-2 will also serve as a frame 
of reference for the timings presented in Vine2. 

8.1. Speed comparison of VINE and Gadget-2 

For this analysis we use snapshots of the simulations 
output by each code at five different times during the sim- 
ulation, to account for variations in the speed due to the 
different mass distributions, seen at different times. We 
adopt the same node opening criteria as in section [7.2.21 
which yielded similar force accuracy for each. Specif- 
ically, we used parameters of 6* = 1 x 10~^ for VINE 
and Gadget-2, which guarantee force errors of less than 
0.2% for 99% of the particles. As before, we convert an 
output dump from the VINE version of the simulation 
to Gadget-2 format, and use that particle distribution in 
the rate calculation in order to eliminate ambiguities due 
to differences in particle distribution in the calculations 
of the rates. For internal consistency, we also calculate 
separate 'reference' accelerations for each code using this 
particle distribution, for use in the relative error deter- 
minations. 

Figure [H] shows the speed of the gravitational force 
calculation on all particles using these accuracy param- 
eters on one processor of the SGI Altix at the Univer- 
sity Observatory, Munich, i.e. one Itanium 2 processor 
rimning at 1.5 GHz. The snapshot at t = is the initial 
setup, where the two galaxies are still well separated. At 
t = 196 Myr, the galaxies have their first close encounter. 
They pass by each other and their distance increases un- 
til the turnaround point. The snapshot at t — 525 Myr 
is shortly before this point. Then the two galaxies merge 
and form a new elliptical. The snapshot a,t t — 1.25 Gyr 
is after the merger but before the system has had time 
to relax completely. The final snapshot a,t t = 1.97 Gyr 
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Fig. 14. — Calculation rate in particles per second for the grav- 
itational force calculation of all 800000 particles in an elliptical- 
elliptical merger simulation performed on one processor. The re- 
sults are shown for different stages of the simulation. 
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Fig. 15. — Speedup relative to one processor for the gravitational 
force calculation of all 800000 particles in an elliptical-elliptical 
merger simulation performed on eight processors. The results are 
shown for different stages of the simulation. 



is at a very late stage when the newly formed system 
has already largely relaxed. For the five snapshots from 
t = to t = 1.97 Gyr, the speed of the gravitational 
force calculation using VINE ranged between ^ 28600 
and ~ 34000 particles per second in serial model, while 
that using Gadget-2 between ^ 6200 to ^ 7100 particles 
per second. These rates correspond to rate differences of 
a factor of ~4.6-4.9 in serial mode for VINE over Gadget- 
2. 

The parallel speedup of the gravity calculation on eight 
processors, relative to the serial calculation, is shown in 
figure [TSl for both codes at the same times as are shown 
in figure [TH In both cases, the scaling is very good. 
At various times during the evolution VINE's speedup 
ranges between ~ 7.6 to just under a 'perfect' scaling of 
a factor 8, while Gadget-2 's scaling ranges betwen ^ 5 
and ~ 5.4. For the eight processor calculation, these 
parallel speedups correspond to rates ranging between 
- 231200 and - 258900 particles per second for VINE, 
while Gadget-2 ranges over rates from ~ 31300 to ~ 
37200 particles per second. In terms of the difference 
in the calculation speeds of VINE and Gadget-2, these 
speedups, yield a difference of a factor of ~ 6.7 — 7.6 for 
VINE over Gadget-2. 

The scalings vary at different times in the simulation 
due to the fact that the particle distributions themselves 
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change over time, becoming both clustered in the core 
regions of the galaxies and extended in their tails. In 
turn, such clustering causes more work to be required 
per update for some particles than for others or, in other 
words, it causes load imbalance to develop between one 
processor and another as this work is assigned unequally 
to each. For simulations of this (comparatively small) 
size, such load imbalances will be more difficult to over- 
come, since the total work is small. We see however, that 
vine's use of the OpenMP 'dynamic' loop scheduling 
strategy to assign small groups of particles to processors 
as they become idle, provides somewhat better perfor- 
mance in this case than the code specified distribution 
of particles provided by the distributed memory strategy 
in Gadget-2. We expect that this difference will become 
smaller for larger problem sizes and, perhaps, even be 
reversed at very large sizes due to the synchronization 
overhead required for OpenMP to parcel out additional 
work to many processors at once. Where the crossover 
point occurs, and indeed whether it occurs at all, is dif- 
ficult to ascertain. 

In addition to the timings of a single force computation 
at several times, we have also measured the time for the 
full simulations. Both codes were run in parallel on 8 pro- 
cessors of the same SGI Altix machine used for the force 
calculation timings, on the same elliptical merger simu- 
lation. VINE required 78445 seconds of wallclock time 
to complete its version of the simulation, while Gadget-2 
required 291076 seconds, a factor of '--^ 3.7 longer than 
VINE. 

Both the snapshot and full simulation rates favor VINE 
over Gadget-2. However, we recall that although we 
have attempted to use code settings that yield similar 
error limits for both codes at all times, differences be- 
tween the resulting error limits will inevitably remain, 
both between the two codes and for the same code at 
different times. Therefore, before concluding that simu- 
lating merger evolution is indeed faster with VINE than 
with Gadget-2, we must answer two important questions. 
First, are the differences due to some unanticipated bias 
in the code settings, differences in the error limits ob- 
tained from each, or details of calculations themselves? 
Second, why are the full simulation ratios smaller by a 
factor ~ 1.9, than the speed differences just quoted for a 
single gravity calculation? 

A prime suspect in evaluating speed differences for full 
simulations is whether or not the total number of force 
calculations performed over the simulation is similar. 
Analysis shows that these numbers are in fact very simi- 
lar for both codes, 4.678 x 10^ for VINE and 4.391 x 10^ 
for Gadget-a difference of only a factor ~ 6% between 
the two. Therefore, the speed differences we see cannot 
originate from this source. 

Next, we consider biases due to error limit differences. 
The node opening parameter settings used in our test, 
figure [9] shows that the error limits are nearly identical 
at all three times for both codes. Therefore we do not 
expect large variation of the actual force errors during the 
simulation. However, the gravitational error limits alone 
would allow for a slightly larger value for the accuracy 
parameter 9 in VINE and only the time evolution of the 
error in total energy forced us to adopt a lower value for 
9 for our timings. 

An additional difference is that VINE always computes 



both the gravitational potential and the force, while 
Gadget-2 computes only the force. While it is possible 
for Gadget-2 to compute the potential as well, it requires 
a second tree traversal and computation, effectively dou- 
bling the time for the combined calculation. Such a fun- 
damental difference between the algorithms in the codes 
would unfairly distort the comparisons between the tim- 
ings in the two codes, so we have compared the com- 
bined force and potential calculation of VINE with the 
force only calculation of Gadget-2, except for snapshot 
dumps, for which both are computed. We conclude that 
the calculation of the gravitational forces in our test case 
is considerably faster with VINE than with Gadget-2. 

Finally, we consider the differences between the speed 
ratios for the full simulations and for the snapshot grav- 
ity calculations. The full simulation timings include, of 
course, contributions from other code operations than 
the gravity calculations themselves, such as the leapfrog 
updates, and tree reconstruction and revision. They also 
time the same overall activities in ways that are not cap- 
tured by a single snapshot timing, such as when only a 
fraction of particles are active. 

Both VINE and Gadget-2 emit statistics on the costs 
of various operations performed by the code, though the 
detailed metrics reported in either case are not identi- 
cal. According to the code-reported timings for each, 
VINE and Gadget-2 require some - 35000 and - 255000 
seconds'", respectively, for 'gravity only' portion of their 
total simulation timings. These timings preserve the 
VINE/Gadget-2 speed ratio of 7 reported for single 
time snapshots of the gravity calculations done on 8 pro- 
cessors. 

Full simulation timing differences must therefore origi- 
nate in other portions of the calculation. Specifically for 
VINE, the remaining contributors to the total simulation 
time are dominated by the tree rebuild and revision op- 
erations. Much of the tree revision cost can be associated 
with systematic deficiency unique to the Altix architec- 
ture. Specifically, we found in Vine2that parallel scaling 
of the tree revision is particularly poor, saturating at a 
speedup of only ~ 2 even at very large processor counts. 
With scaling more typical of that seen on other architec- 
tures vine's performance would no doubt improve. 

Excluding the force calculation, the most costly activ- 
ity in the simulation is not tree revisions, but rebuilds. Of 
some relevance is our conclusion in Vine2that the overall 
cost of tree rebuilds may be amortized over more or less 
total simulation time by tuning the frequency of rebuilds 
vs. revisions. The cost must be balanced against an in- 
crease in the force calculation times however, and our 
timing incorporates the settings, believed to be near op- 
timal, recommended in Vine2. To explore this question 
further, we reran the VINE simulation with a more per- 
missive setting (less frequent rebuilds) and found that 
indeed while rebuild costs dropped substantially, force 
calculation costs increased even more, resulting in poorer 
performance overall. 

We conclude with the observation that the merger sim- 
ulation chosen for our comparison study proves to be a 

^ For consistency with the single snapshot calculations above, 
the latter value excludes a contribution from calculations of the 
gravitational potential done by Gadget-2 for snapshot dumps, in a 
separate step from the force calculations 
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remarkably challenging configuration for both VINE and 
Gadget-2, but for different reasons. For VINE, the tree 
rebuild and revision costs prove to be substantially higher 
than we typically observe in simulations of other physi- 
cal systems, for which we typically find that they com- 
pose at most 10-20reduced parallel scalability in figure 
[T51 is reported by the code itself to comprise approx- 
imately half of the total run tim e. Similar scal ability 
issues were originally discussed by iSpringell ([20051) for a 
much smaller problem consisting of 60000 particles, and 
attributed to the fact that some particle configurations 
prove to be quite challenging to the domain decomposi- 
tion algorithms used to distribute work across multiple 
processors. 

Although the actual speed differences and scaling be- 
tween the codes will likely vary somewhat between dif- 
ferent problems, we conclude that VINE will be an ex- 
cellent choice for use in efficiently solving most problems 
encountered in astrophysical contexts. 



9. SUMMARY 

In this paper we have introduced VINE, a hybrid A^- 
body/SPH code which uses a binary tree structure for the 
calculation of gravitational forces. It is a very modular 
and flexible code which allows the user to compile and use 
only those modules which are required for simulating the 
physics of the problem at hand. This modular structure 
also makes it fairly easy to exchange such modules for 
new ones if needed or to add others to implement new 
physics or numerical features. 

The code includes both a Runge-Kutta-Fehlberg and 
a leapfrog integrator, which can be chosen by the user at 
compile time. Both can make use of an individual parti- 
cle timestep scheme. We have described the SPH imple- 
mentation in VINE, including details of the symmetriza- 
tion and the scheme for adapting the smoothing lengths. 
VINE includes the capability for calculating gravitational 
or other long range forces using a tree structure to or- 
ganize and sort particles into near and far interactions, 
and an outline of the techniques are described here. We 
describe the implementation of periodic boundary con- 
ditions used in both the gravity and the SPH part of 
the code. Finally, we demonstrate the capability of the 
code to accurately simulate both hydrodynamic and N- 
body problems using, in the first case, the collapse of 
a gas sphere as a test problem and, in the second, an 
elliptical-elliptical galaxy merger. 

We have demonstrated that the code performs well on 
a standard, but somewhat contrived, test problem with 
a well known result-the Evrard collapse problem. More 
importantly, we demonstrate that it performs well on a 
full astrophysical simulation of the merger of two ellip- 
tical galaxies, in comparison to the publically available 
Gadget-2 code. The performance of the gravitational 
force calculation in this test, the most costly component 
of simulations including self-gravity, was superior to that 
of Gadget-2. The speed difference ranged between 4.6- 
4.9 times faster on one processor at different times in the 
calculation, with better parallel scaling than provided by 
Gadget-2 as well. For a full simulation timing on 8 pro- 
cessors, we find a smaller difference, with VINE being 
only 2.9 times faster. 



9.1. Additional optimizations, features and future 
directions 

VINE, as it has been presented here and in Vine2, will 
be released to the pubhc under GNU Public License. We 
hope that it will become a useful tool for use on a wide va- 
riety of problems. At present, the code exists in a flexible, 
'base' version, which includes a number of basic physics 
packages common in many astrophysical contexts, but 
is by no means complete. We expect that other workers 
will wish to incorporate packages not included in the base 
version, or to make changes to it in order to advance their 
own research goals. As one of its strengths for users, we 
expect that this process will not be overly burdensome 
to its users, due both to the features included in its base 
version and the overall modular design. In the future, we 
expect to push the code to become still more of a more 
general purpose 'multi-physics' code, for use in a wider 
variety of contexts. 

Already, efforts are underway to incorporate such ad- 
ditional physics as required to satisfy our own research 
goals, for example. At the University Observatory in 
Munich, efforts are underway to develop modules to im- 
plement radiative cooling, subgrid models for star for- 
mation on galactic scales, stellar feedback, black hole 
accretion, AGN feedback, inflow boundaries and ioniz- 
ing radiation. These packages are at many different 
stages of development, with some essentially complete, 
others are merely in planning stages, while still others, 
such as a module to simulate cosmological evolution, 
exist in the code already but are not well maintained 
and may no longer be fully functional. VINE has also 
been the basis for an implementation of the SPH method 
on special purpose, reconfigurable hardware, so called 
FP GA (field Program mable Gate ^rray) boards (see 
e.g. iMarcus et alll2007| ). 

In current form, VINE's SPH module does not include 
strength and damage models needed to simulation sys- 
tems including solid bodies. In future work, we intend 
to integrate into VINE such a model, present in a much 
earlier cousin p'enz fc Asphaud 119951 ) ■ for use in mod- 
ern simulations of, e.g., giant impacts between planetary 
bodies or of asteroids on earth. These types of calcula- 
tions may also require different or more advanced treat- 
ments of the hydrodynamics, such as are available us- 
ing the Moving Least Squares Particle Hydrodynamics 
(MLSPH) approach of .Dilts (1999, ,2000.) . which may be 
substituted for VINE's standard SPH module. 

Other features of interest include alternate integra- 
tors and improvements to the physics modules that al- 
ready exist in the code. The VINE framework could, for 
example, be adapted to accommodate integrators spe- 
cially des igned for use i n highly accurate iV-body sim- 
ulations (jAarsethI 119991 : iLevison fc Dimc an' TgM ). A 
KDK variant of the leapfrog integrator fSpringel 2005[ ) 
is also of great interest, due to its stability properties 
and the fact that force calculations can be parallelized 
with higher efficiency when used in individual time step 
mode. Similarly, modifying the SPH module to incor- 
p orate an entropy conservin g formalism similar to that 
of iPrice fc Monaghan' (2004) will permit a more faithful 
reproduction of hydrodynamic features of the flow. 

VINE has been best tested on problems characterized 
by moderate particle counts (up to a few xlO^ — 10^) 
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but very large time step counts per particle during the 
simulation lifetime, by some fraction of the total num- 
ber of particles. Common examples of such problems are 
simulations of systems with widely varying dynamical 
time scales, such as occur in the evolution of circum- 
stellar disks or in galaxy collisions. Because the large 
time step counts occur for only the comparatively small 
fraction of particles defining the most dynamically active 
regions, only a small fraction of the particles require new 
force calculations at a given time and the code's parallel 
efficiency suffers. An important consequence of this ef- 
fect is that it frequently precludes simulations with very 
large particle counts, due to the accompanying increase 
in time to completion, and is a major motivation behind 
our effort to optimize VINE for speed and fine grained 
parallelism, rather than to explore its performance at 
more extreme computational scales. 

Indeed, at the largest scales where computing plat- 
forms largely consist of clusters of nodes each with their 
own private memory, VINE cannot be run at all in its 
current form. It will therefore be a priority for future 
versions of VINE to include a distributed memory layer, 
most likely using standard 'Message Passing Interface' 
(MPI) library functionality, in order to be more generally 
useful in cluster environments as well. The current ver- 
sion of VINE is best optimized and tested for moderate 
sized, shared memory machines with up to ~ 100 pro- 
cessors. Currently, multi-core processors and multi-chip 
nodes are becoming common in inexpensive, commodity 
computing environments. VINE will be well suited to 
these machines on problems up to at least tens of millions 
of particles. Larger scale shared memory machines are 
available in sizes up to hundreds to thousands of cores, 
and we expect that VINE will run unaltered on all or 
parts of such machines, though users may encounter is- 
sues on such scales that we have not explored. 

We expect that if VINE finds wide use in the astrophys- 
ical community, many other modules may be developed, 
beyond those suggested here. We hope that the code will 



be as useful to others in the astrophysical community as 
it has been for us so far. 

9.2. Availability of the code 

The code is available to the public under GNU General 
Public License version 2, via download from the ApJS 
website, directly from the authors or via download at the 
USM website: http://www.usm.lmu.de/people/mwetz 
and at both http://arXiv.org/abs/0802.4245 and 
http:/7arXiv.o rg/abs/0802.4253, 5y~ choosing the 
"Other t'ormats" sublink and then "Download Source". 
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of the National Nuclear Security Administration of the 
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implementation of the periodic boundaries. We thank 
Volker Springel for help with the Gadget-2 code as well 
as making a fix for a problem with the domain decom- 
position available to us. We thank Matthias Steinmetz 
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thank UKAFF for financial support. 



REFERENCES 



Aarseth, S. J. 1963, MNRAS, 126, 223 

— . 1999, PASP, 111, 1333 

Athanassoula, E. 2003, MNRAS, 341, 1179 

Athanassoula, E., Fady, E., Lambert, J. C, & Bosma, A. 2000, 

MNRAS, 314, 475 
Attwood, R. E., Goodwin, S., P., & Whitworth, A. P. 2007, a, 464, 

447 

Balsara, D. S. 1990, PhD thesis, University of lUinois 
— . 1995, J. Comp. Phys., 121, 357 

Bate, M. R., Bonnell, I. A., & Price, N. M. 1995, MNRAS, 277, 
362 

Bate, M. R. & Burkert, A. 1997, MNRAS, 288, 1060 
Bell, E. F., Naab, T., Mcintosh, D. H., Somerville, R. S., Caldwell, 
J. A. R., Barden, M., Wolf, C, Rix, H.-W., Beckwith, S. V., 
Borch, A., Haussler, B., Heymans, C, Jahnke, K., Jogee, S., 
Koposov, S., Meisenheimer, K., Peng, C. Y., Sanchez, S. F., & 
Wisotzki, L. 2006, ApJ, 640, 241 
Benz, W. 1988, Comp. Phys. Comm., 48, 97 

Benz, W. 1990, in Numerical Modelling of Nonlinear Stellar 
Pulsations: Problems and Prospects, ed. J. R. Buchler (Kluwer, 
Dordrecht), 269 

Benz, W. & Asphaug, E. 1995, Computer Physics Communications, 
87, 253 

Benz, W., Bowers, R. L., Cameron, A. G. W., & Press, W. H. 1990, 
ApJ, 348, 647 

Berczik, P., Merritt, D., Spurzem, R., & Bischof, H.-P. 2006, ApJ, 
642, L21 



Bonnell, I. A., Clarke, C. J., & Bate, M. R. 2006, MNRAS, 368, 
1296 

Burkert, A. & Naab, T. 2005, MNRAS, 363, 597 
Burkert, A., Naab, T., & Johansson, P. H. 2007, ArXiv e-prints, 
710 

Carraro, G., Lia, C, & Chiosi, C. 1998, MNRAS, 297, 1021 
Dasyra, K. M., Tacconi, L. J., Davies, R. I., Genzel, R., Lutz, D., 

Naab, T., Burkert, A., Veilleux, S., & Sanders, D. B. 2006a, ApJ, 

638, 745 

Dasyra, K. M., Tacconi, L. J., Davies, R. I., Naab, T., Genzel, R., 
Lutz, D., Sturm, E., Baker, A. J., Veilleux, S., Sanders, D. B., 
& Burkert, A. 2006b, ApJ, 651, 835 

Dave, R., Dubinski, J., & Hernquist, L. 1997, New Astronomy, 2, 
277 

Dehnen, W. 2001, MNRAS, 324, 273 

Dilts, G. A. 1999, Int, J. for Num. Meth. in Eng., 44, 1115 
— . 2000, Int, J. for Num. Meth. in Eng., 48, 1503 
Dolag, K., Vazza, F., Brunetti, G., & Tormen, G. 2005, MNRAS, 
364, 753 

Evrard, A. E. 1988, MNRAS, 235, 911 

Ewald, P. P. 1921, Ann. Phys., 64, 253 

Ewell, M. W. 1988, PhD thesis, Princeton University 

Fehlberg, E. 1968, NASA T.R., 287 

Fletcher, C. A. J. 1997, Computational Techniques for Fluid 
Dynamics, second edition edn. (Springer- Verlag) 

Frigo, M. & Johnson, S. G. 2005, Proceedings of the IEEE, 93, 
216, special issue on "Program Generation, Optimization, and 
Platform Adaptation" 



VINE - a new hybrid SPH / TV-body code 



31 



Pukushige, T., Ito, T., Makino, J., Ebisuzaki, T., Sugimoto, D., & 

Umemura, M. 1991, PASJ, 43, 841 
Pukushige, T., Makino, J., & Kawai, A. 2005, PASJ, 57, 1009 
Gingold, R. A. & Monaghan, J. J. 1977, MNRAS, 181, 375 
— . 1982, J. Comp. Phys., 46, 429 

Gritschneder, M., Naab, T., Burkert, A., Walch, S., Heitsch, P., & 

Wetzstein, M. 2009a, MNRAS, 393, 21 
Gritschneder, M., Naab, T., Walch, S., Burkert, A., & Heitsch, P. 

2009b, ArXiv e-prints 
Hernquist, L., Bouchet, P. R., & Suto, Y. 1991, ApJS, 75, 231 
Hernquist, L., Hut, P., & Makino, J. 1993, ApJ, 402, L85 
Hernquist, L. & Katz, N. 1989, ApJS, 70, 419 
Hockney, R. W. & Eastwood, J. W. 1981, Computer Simulations 

Using Particles (McGraw-Hill Inc.) 
Hockney, R. W. & Hohl, P. 1969, AJ, 74, 1102 
Hultman, J. & Kallander, D. 1997, A&A, 324, 534 
Ito, T., Ebisuzaki, T., Makino, J., & Sugimoto, D. 1991, PASJ, 43, 

547 

Jackson, J. D. 1975, Classical Electrodynamics (Chichester, 

Toronto: John Wiley & Sons) 
Jesseit, R., Naab, T., & Burkert, A. 2005, MNRAS, 360, 1185 
Jesseit, R., Naab, T., Pclcticr, R. P., & Burkert, A. 2007, MNRAS, 

376, 997 

Kawai, A., Pukushige, T., Taiji, M., Makino, J., & Sugimoto, D. 

2000, PASJ, 52, 659 
Khochfar, S. & Burkert, A. 2006, A&A, 445, 403 
Kitsionas, S., Klcsscn, R., Fcdcrrath, C, Schmidt, W., Price, D., 

Dursi, J., Gritschneder, M., Walch, S., Piontek, R., Kim, J., 

Jappscn, A. ., Ciecielag, P., & Mac Low, M. . 2008, ArXiv e- 

prints 

Klcsscn, R. 1997, MNRAS, 292, 11 

Kutta, W. 1901, Zeitschrift fiir Mathcmatik und Physik, 46, 435 
Lattanzio, J. C, Monaghan, J. J., Pongracic, H., & Schwaxz, M. P. 

1986, SIAM J. Sclent. Comp., 7, 591 
Levison, H. P. & Duncan, M. J. 1994, Icarus, 108, 18 
Lia, C. & Carraro, G. 2000, MNRAS, 314, 145 
Lombardi, J. C, Sills, A., Rasio, P. A., & Shapiro, S. L. 1999, J. 

Comp. Phys., 152, 687 
Lucy, L. B. 1977, AJ, 82, 1013 

Makino, J., Pukushige, T., Koga, M., & Namura, K. 2003, PASJ, 
55, 1163 

Makino, J., Ito, T., & Ebisuzaki, T. 1990, PASJ, 42, 717 
MaJcino, J. & Taiji, M. 1998, Scientific Simulations with Special- 
Purpose Computers : The GRAPE Systems (Chichester , 
Toronto: John Wiley & Sons) 
Makino, J., Taiji, M., Ebisuzaki, T., & Sugimoto, D. 1997, ApJ, 
480, 432 

Marcus, G., Licnhart, G., Kugel, A., Maenner, R., Bcrczik, P., 
Spurzem, R., Wetzstein, M., Naab, T., & Burkert, A. 2007, 
in "SPHERIC - Smoothed Particle Hydrodynamics European 
Research Interest Community"., 63 — h 

Merritt, D. 1996, AJ, 111, 2462 

Merritt, D. & SzcU, A. 2006, ApJ, 648, 890 

Monaghan, J. J. 1985, Comp. Phys. Rep., 3, 71 

— . 1988, Comp. Phys. Comm., 48, 89 

— . 1989, J. Comp. Phys., 82, 1 

— . 1992, ARA&A, 30, 543 

— . 2002, MNRAS, 335, 843 



Monaghan, J. J. & Gingold, R. A. 1983, J. Comp. Phys., 52, 374 
Monaghan, J. J. & Lattanzio, J. C. 1985, A&A, 149, 135 
Morris, J. M. & Monaghan, J. J. 1997, J. Comp. Phys., 136, 41 
Naab, T. & Burkert, A. 2003, ApJ, 597, 893 

Naab, T., Burkert, A., Johansson, P. H., & Jesseit, R. 2007, ArXiv 
e-prints, 709 

Naab, T., Jesseit, R., & Burkert, A. 2006a, MNRAS, 372, 839 
Naab, T., Khochfar, S., & Burkert, A. 2006b, ApJ, 636, L81 
Naab, T. & Trujillo, I. 2006, MNRAS, 369, 625 
Nelson, A. P. 2006, MNRAS, 373, 1039 

Nelson, A. P., Benz, W., Adams, P. C, & Arnett, D. 1998, ApJ, 
502, 342 

Nelson, A. P., Benz, W., & Ruzmaikina, T. V. 2000, ApJ, 529, 357 
Nelson, A. P., Wetzstein, M., & Naab. T. 2008, ApJS, 0, 
Nelson, R. P. & Papaloizou, J. C. B. 1993, MNRAS, 265, 905 
Nelson, R. P. & Papaloizou, J. C. B. 1994, MNRAS, 270, 1 
Okumura, S. K., Makino, ,1., Ebisuzaki, T., Pukushige, T., Ito, T., 

Sugimoto, D., Hashimoto, E., Tomida, K., & Miyakawa, N. 1993, 

PASJ, 45, 329 

O'Shea, B. W., Bryan, G., Bordner, J., Norman, M. L., Abel, 
T., Harkncss, K., & Kritsuk, A. 2004, in Adaptive Mesh 
Refinement - Theory and Applications, cd. T. Plewa, t. Linde, 
& V. Weirs, Lecture Notes in Computational Science and 
Engineering (Springer- Verlag, Berlin Heidelberg New York) 

O'Shea, B. W., Nagaminc, K., Springel, V., Hernquist, L., & 
Norman, M. L. 2005, ApJS, 160, 1 

Portegies Zwart, S. P., Baumgardt, H., McMillan, S. L. W., 
Makino, J., Hut, P., & Ebisuzaki, T. 2006, ApJ, 641, 319 

Porter, D. 1985, PhD thesis. University of California, Berkeley 

Press, W., Teukolsky, S. A., Vetterling, W., & Flannery, B. P. 
1992, Numerical Recipes, 2nd edn. (Cambridge University Press, 
Cambridge) 

Price, D. J. & Monaghan, J. J. 2004, MNRAS, 348, 139 
— . 2007, MNRAS, 374, 1347 
Rasio, P. A. & Shapiro, S. L. 1991, ApJ, 377, 559 
Romeo, A. B. 1997, A&A, 324, 523 

Rosswog, S., Davies, M. B., Thielemann, P.-K., & Piran, T. 2000, 

A&A, 360, 171 

Ryu, D., Ostriker, J. P., Kang, H., & Gen, R. 1993, ApJ, 414, 1 
Salmon, J. K. & Warren, M. S. 1994, J. Comp. Phys., Ill, 136 
Springel, V. 2005, MNRAS, 364, 1105 
Springel, V. & Hernquist, L. 2002, MNRAS, 333, 649 
Springel, V., Yoshida, N., & White, S. D. M. 2001, New Astronomy, 
6, 79 

Steinmetz, M. 1996, MNRAS, 278, 1005 

Steinmetz, M. & MuUer, E. 1993, A&A, 268, 391 

Stone, J. M. & Norman, M. L. 1992, ApJS, 80, 753 

Sugimoto, D., Chikada, Y., Makino, J., Ito, T., EbisuzaJfi, T., & 

Umemura, M. 1990, Nature, 345, 33 
Thacker, R. J., Tittley, E. R., Pearce, P. R., Couchman, H. M. P., 

& Thomas, P. A. 2000, MNRAS, 319, 619 
Thomas, J., Jesseit, R., Naab, T., Saglia, R. P., Burkert, A., & 

Bender, R. 2007, MNRAS, 381, 1672 
Wadsley, J. W., Stadel, J., & Quinn, T. 2004, New Astronomy, 9, 

137 

Wetzstein, M., Naab, T., & Burkert, A. 2007, MNRAS, 375, 805 



